MacOS-only version with MPS support by fnachon · Pull Request #127 · gnina/libmolgrid

fnachon · 2026-03-31T12:14:48Z

This is a MacOS-only version created with Claude.
Dependencies must be installed with Homebrew.
Xcode command Line Tools is necessary to compile Metal kernels

Rewriting for support of Apple silicon

dkoes

This seems to replace CUDA with Metal, which is not what is desired. Can the choice of backend be a compile time switch?

fnachon · 2026-03-31T12:45:27Z

It may not be easy, but I can always try. I'll have a look at it and come back to you later.
Another option is to create a macOS-specific branch. Not ideal, but better than nothing.
Note that for gnina, I pointed to my fork for MacOS building.

fnachon · 2026-04-05T15:34:39Z

I restored CUDA compatibility alongside Metal, with a choice of backend at installation. So that you know, I couldn't test the CUDA side.

phillipweinstock · 2026-05-30T17:32:31Z

Hi @fnachon I traced your commits back from Gnina.
First off, you attached claudes brainstorming documentation so even if someone merged it in, it would drag all the junk in. Quite frankly this pull is not even possible to audit in the slightest. I commend you wanting to try and vibe code your way around CUDA, but even trained software engineers such as myself know that this isnt a viable solution.
Seeing as you don't want your work to goto waste, I suggest you keep your changes as a patch and follow this package upstream. Same with Gnina. That way you can continue working until someone else figures out how to swap out CUDA. Probably the best way to do it is first shim/wrap all CUDA logic into an abstracted interface, that way the backend becomes swappable instead of a mess of include statements and defines. Its still a very large job, your probably looking at a major rewrite of this library I.E the kind of thing the original authors would probably need a grant to do.

phillipweinstock · 2026-05-31T03:15:18Z

Actually I've just checked further. Just a little background I have not ever used CUDA before but just a glance at grid_maker.cpp Line 358, shows float3 datatypes everywhere + exposing CUDA datatypes via public facing API. Any serious effort to port this to metal or any other platform like HIPS (close to gold standard), will by definition not be backwards compatible a major architectural refactor is required.
Regarding CUDA datatypes, a quick step to start cleaning up the internals would be replacing the float3 and other CUDA specific data types with a abstracted Vec3 datatype.

.claude/settings.local.json is a per-developer permission allowlist with local filesystem paths; it has no business in version control.

The public API reused CUDA's own vector_types.h names (float3, uint2, ...) even in non-CUDA builds, since common.h just redefined structs with the same names when LIBMOLGRID_USE_CUDA was off. That makes the API look CUDA-coupled when it isn't. Replace the fallback with a single, unconditional Vec3/UVec2 pair plus a to_cuda() conversion helper for the CUDA kernel-launch boundary; downstream renames follow in subsequent commits.

Mechanical rename of float3/uint2 to Vec3/UVec2 across the public headers and their host (.cpp) implementations. The genuine CUDA __global__/__device__ kernel functions declared as friends in grid_maker.h keep native float3, since they're real device code compiled by nvcc, not part of this host API.

These Objective-C++ files are compiled by clang as ordinary host C++, not Metal Shading Language, so they follow the same Vec3/UVec2 rename as the rest of the host API. src/metal/shaders.metal is untouched: that's genuine MSL with its own native float3.

GridMaker/GridInterpolater/Transform's host-callable forward/backward methods now take Vec3 (matching their header declarations) and convert to native CUDA float3 at the actual kernel-launch boundary via to_cuda(). Free __global__/__device__ kernels that only do raw GPU buffer arithmetic keep native float3 throughout. Kernels that call Vec3-based shared host/device logic (Quaternion::rotate/transform, cart2grid/grid2cart) use Vec3 directly instead, since it's a trivial POD valid in both host and device code under nvcc. I don't have a CUDA toolchain to compile-test this locally (macOS); this needs verification via the project's CI or on a CUDA machine before merging upstream.

molgrid.float3 was a public Python class, not just an internal type, so the C++ rename alone would have broken existing user code and tests. Bind it as molgrid.Vec3 and alias float3 = Vec3 in python/__init__.py so molgrid.float3(...) and tuple call sites keep working unchanged.

Host, Metal (.mm), and CUDA (.cu) test files all call directly into the renamed public API (constructors, get_grid_dims, forward/backward, ...), so they need the same float3->Vec3 rename. test_transform.py is left unchanged: it exercises the molgrid.float3 back-compat alias as a regression check.

fnachon · 2026-06-30T21:45:29Z

Thanks for the detailed look, both points were fair and I've pushed fixes:

The .claude/ file — that was a local dev-tool config file (a permission allowlist referencing my own filesystem paths) that had no business in this diff. Removed and added .claude/ to .gitignore (0c522c7).

CUDA datatypes in the public API — also a fair point, and worth separating into two parts:

The non-CUDA build path already isolated CUDA via LIBMOLGRID_USE_CUDA (see common.h) — when CUDA is off, float3/uint2/etc. were just local structs, not the real CUDA types. So this was never the "every platform needs its own copy of CUDA logic" problem the rewrite framing suggested.
The actual bug was naming: those local structs reused CUDA's names, so the public API (grid_maker.h and friends) looked CUDA-coupled even when it wasn't — exactly what you ran into at grid_maker.cpp.

I've replaced that with a single backend-neutral Vec3/UVec2 (defined unconditionally in common.h, no #ifdef), used throughout the public C++ API, the Metal dispatcher layer, and the Python bindings (molgrid.Vec3, with molgrid.float3 kept as an alias for existing user code). Genuine CUDA float3 is now confined to the actual .cu kernel files and the kernel-launch boundary (one explicit to_cuda() conversion call per launch site); the real Metal Shading Language float3 in shaders.metal is untouched since that's a different compiler/language entirely.

Split into 6 commits so it's reviewable in stages (type definition → public API → Metal dispatcher → CUDA boundary → Python bindings → tests) rather than one large diff.

Caveat: I don't have a CUDA toolchain to compile-test the .cu changes locally (macOS). I built and ran the full C++/Metal/Python test suite here and confirmed no regressions versus the pre-change baseline (a couple of pre-existing failures — an unrelated Metal forward-pass numerical bug and missing torch/psutil in my env — are unchanged either way). The CUDA-side changes should get exercised by CI's CUDA compile step; happy to have someone with a CUDA box sanity-check before this lands anywhere.

ManagedGrid's CPU allocation placed a tiny header struct before the data pointer, which on the non-CUDA (Metal) build shrank to 1 byte and broke the page alignment that MTLBuffer's zero-copy wrapping requires. GPU dispatches (Transform, GridMaker, GridInterpolater) silently wrote into a buffer that was never visible to the CPU-side pointer, which is what was causing the apply_transform/Vec3 regression and several other GPU-path test failures. Fix: keep the header in the same allocation but place it after the data instead of before, so the data pointer stays exactly page-aligned. Also: - python/CMakeLists.txt: resolve Python symbols via -undefined,dynamic_lookup on macOS instead of linking libpython directly, since some distributions (e.g. conda) statically embed the interpreter rather than exposing it as a dylib; linking it directly pulled in a second CPython runtime and crashed on import. - torch_bindings.py: MolIterDataset.__setstate__ (used when DataLoader workers are spawned, which macOS does by default) reconstructed self.examples but never recomputed self._num_labels. - test_coordinateset.py: deduplicate two same-named test functions (the second was silently shadowing the first), and make the CoordinateSet leak-detection test robust to OpenBabel's legitimate one-time/lazy cache allocations instead of asserting byte-exact RSS equality. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>

gninasrc (molgridder.cpp/h, torch_model.cpp) now uses libmolgrid's Vec3/UVec2 types, which only exist on the fnachon/libmolgrid fork (gnina/libmolgrid#127, not yet merged upstream). The non-Metal ExternalProject_Add was still pulling upstream gnina/libmolgrid, which has no Vec3 type, breaking the CUDA/Linux CI build. Point it at the fork like the Metal branch already does, until gnina#127 merges. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>

Both come from CUDA-templated code now getting instantiated inside plain .cpp translation units (compiled by g++ directly), not just .cu files (compiled by nvcc, which is more permissive by default): - common.h: to_cuda() called make_float3(), which lives in <vector_functions.h>. nvcc implicitly includes that for .cu files; a host-only g++ compile of a .cpp file doesn't get it, so the name was undeclared. Build the ::float3 via aggregate init instead. - managed_grid.h: cudaMalloc(&gpu_info->gpu_ptr, ...) passes a float**/double** where cudaMalloc wants void**, an implicit pointer-to-pointer conversion that's illegal in strict C++. nvcc passes -fpermissive to its host-compiler stage for .cu files, masking this; plain g++ on a .cpp file does not. Add the explicit cast. Confirmed via the PR's own CI log (gnina/libmolgrid run 28481163579): GCC 11.4.0 + CUDA 12.6.68 hit exactly these two errors while compiling src/libmolgrid.cpp. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>

fnachon added 2 commits March 30, 2026 12:51

Change for metal support for Apple Silicon

5a03f17

Rewriting for support of Apple silicon

Update README.md

fd6fcff

dkoes reviewed Mar 31, 2026

View reviewed changes

fnachon added 2 commits April 5, 2026 17:06

Restore upstream CUDA compatibility alongside Metal

8a22172

Regenerate docs and fix macOS doc builds

251dc0f

fnachon added 7 commits June 30, 2026 23:09

Stop tracking local Claude Code settings

0c522c7

.claude/settings.local.json is a per-developer permission allowlist with local filesystem paths; it has no business in version control.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

MacOS-only version with MPS support#127

MacOS-only version with MPS support#127
fnachon wants to merge 13 commits into
gnina:masterfrom
fnachon:master

fnachon commented Mar 31, 2026

Uh oh!

dkoes left a comment

Uh oh!

fnachon commented Mar 31, 2026

Uh oh!

fnachon commented Apr 5, 2026

Uh oh!

phillipweinstock commented May 30, 2026

Uh oh!

phillipweinstock commented May 31, 2026

Uh oh!

fnachon commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

fnachon commented Mar 31, 2026

Uh oh!

dkoes left a comment

Choose a reason for hiding this comment

Uh oh!

fnachon commented Mar 31, 2026

Uh oh!

fnachon commented Apr 5, 2026

Uh oh!

phillipweinstock commented May 30, 2026

Uh oh!

phillipweinstock commented May 31, 2026

Uh oh!

fnachon commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants