sweeps and train on hip/rocm/amdgpu, for 5.0 #573

Draft
sozforex wants to merge 1 commit into PufferAI:5.0 from sozforex:rocm_for_50
@sozforex
Contributor

@sozforex sozforex commented May 23, 2026

Alternative to #562 for the 5.0 branch.
Tested only with a short breakout sweep and eval on an AMD GPU.

Made with a lot of help from Codex GPT-5.5.

@sozforex sozforex changed the title from "sweeps and train on hip/rocm/amdgpu" to "sweeps and train on hip/rocm/amdgpu, for 5.0" May 23, 2026
@jsuarez5341
Contributor

Willing to merge if greatly cleaned up. There's a lot of random sweeps etc. stuff here that's not related to AMD support, and env changes as well. It would also need at least decent perf on AMD to be worth supporting.

@sozforex
Contributor Author

sozforex commented May 25, 2026

I apparently messed up a rebase when moving this from the 4.0 branch to the 5.0 branch, so some wrong changes got in [they did not affect my test sweep/train on breakout]. Sorry about that.

About the sweep-related changes in this PR: there was a problem with sweep_obj [used for early stopping] being passed between processes when a protein sweep runs on an AMD GPU. torch has special handling for this on CUDA, but not on ROCm. This PR changes how the info related to early stopping is passed around.
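
For illustration only, here is a minimal sketch of one backend-agnostic way to hand early-stopping info between processes: reduce each metric to built-in Python types before it crosses the process boundary, so nothing depends on device-specific tensor sharing. This is not the code from this PR. The names `to_plain_payload`, `should_stop_early`, `worker`, and `run_trial` and the patience rule are all hypothetical, and the sketch uses only the standard library rather than PufferLib or torch.

```python
import multiprocessing as mp


def to_plain_payload(step, score):
    """Reduce a metric to built-in types so it pickles the same on any backend.

    `score` can be anything float() accepts (a Python number, a 0-d tensor, ...).
    """
    return {"step": int(step), "score": float(score)}


def should_stop_early(scores, patience=3):
    """Hypothetical rule: stop once the best score is `patience` reports old."""
    if len(scores) <= patience:
        return False
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return len(scores) - 1 - best_index >= patience


def worker(scores, queue, stop_event):
    """Stand-in for a training process: report one plain payload per step."""
    try:
        for step, score in enumerate(scores):
            if stop_event.is_set():
                break
            queue.put(to_plain_payload(step, score))
    finally:
        queue.put(None)  # sentinel: no more reports


def run_trial(scores, patience=3):
    """Parent side: collect reports and signal the worker when to stop."""
    queue = mp.Queue()
    stop_event = mp.Event()
    proc = mp.Process(target=worker, args=(scores, queue, stop_event))
    proc.start()
    seen = []
    for report in iter(queue.get, None):  # keep draining until the sentinel
        seen.append(report["score"])
        if should_stop_early(seen, patience):
            stop_event.set()
    proc.join()
    return seen, stop_event.is_set()


if __name__ == "__main__":
    seen, stopped = run_trial([0.1, 0.4, 0.5, 0.45, 0.44, 0.43, 0.42, 0.41])
    print(len(seen), stopped)
```

How many reports arrive before the worker notices the stop signal depends on scheduling, so the parent keeps draining the queue until the sentinel before joining.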

It has decent performance on a consumer AMD GPU [RX 6850M XT], but I have not tested it on anything else.

@sozforex sozforex marked this pull request as draft May 26, 2026 00:44
@sozforex sozforex force-pushed the rocm_for_50 branch 5 times, most recently from 6b9cee4 to 6b59ae4 May 26, 2026 10:39
@sozforex sozforex marked this pull request as ready for review May 26, 2026 11:15
@sozforex sozforex marked this pull request as draft May 26, 2026 13:31