Add minimal SM120 FP4 paged MQA support #55

Open
liz-badada wants to merge 1 commit into sgl-project:dev from liz-badada:sm120-fp4-paged-mqa

@liz-badada

This PR cherry-picks the minimal subset of the original DeepGEMM PR deepseek-ai#324 that is required to enable the DeepSeek-V4 FP4 indexer on SM120. PR sgl-project/sglang#27059 builds on top of this change.

  • Includes only the necessary SM120 FP4 paged-MQA code and the associated integration changes.

Extract only the paged FP4 MQA path required by sgl-project/sglang#27059 from deepseek-ai#324. The upstream feature commit 9160119 is not independently cherry-pickable because it relies on earlier SM120 JIT, MMA, and scheduler refactors, so this commit carries only the leaf support missing from sgl/dev.

The supported contract is intentionally limited to next_n=1, 64 query heads, head_dim=128, block_kv=64, FP32 logits, 2-D context lengths, and no varlen indices. Dense MQA, FP8 MQA, BF16 logits, wider shape support, and SM121 family reuse are deliberately excluded.
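
For reviewers, the contract above can be written down as a small checker. This is only an illustrative sketch in plain Python: `PagedMqaConfig`, its field names, the dtype strings, and `contract_violations` are hypothetical and are not part of DeepGEMM or of this PR. Only the expected values come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagedMqaConfig:
    """Hypothetical description of one paged FP4 MQA call (not a DeepGEMM type)."""
    next_n: int
    num_heads: int
    head_dim: int
    block_kv: int
    logits_dtype: str          # assumed spelling, e.g. "float32" or "bfloat16"
    context_lens_ndim: int     # number of dimensions of the context-lengths tensor
    has_varlen_indices: bool


# Values taken from the PR description; the key names are illustrative.
SUPPORTED_CONTRACT = {
    "next_n": 1,
    "num_heads": 64,
    "head_dim": 128,
    "block_kv": 64,
    "logits_dtype": "float32",
    "context_lens_ndim": 2,
    "has_varlen_indices": False,
}


def contract_violations(cfg: PagedMqaConfig) -> list[str]:
    """Return one message per field that falls outside the supported contract."""
    return [
        f"{name}: expected {want!r}, got {getattr(cfg, name)!r}"
        for name, want in SUPPORTED_CONTRACT.items()
        if getattr(cfg, name) != want
    ]


supported = PagedMqaConfig(1, 64, 128, 64, "float32", 2, False)
unsupported = PagedMqaConfig(2, 64, 128, 64, "bfloat16", 2, False)

print(contract_violations(supported))
for message in contract_violations(unsupported):
    print(message)
```

A caller could run a check like this before dispatching to the SM120 kernel and fall back to another path when the list is non-empty. Whether the real integration validates shapes this way is not stated in the PR.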