[cuTile] Add rope/qwen2vl_mrope/kl_div/group_norm/multi_token_attention by xjmxyt · Pull Request #1269 · linkedin/Liger-Kernel

xjmxyt · 2026-06-26T08:35:30Z

cuTile implementations for group_norm, kl_div, llama4_rope, qwen2vl_mrope, rope, sparsemax, tiled_mlp, and multi_token_attention, dispatched via LIGER_KERNEL_IMPL=cutile. Includes three-way (liger_triton / liger_cutile / torch|huggingface) speed+memory benchmark data on B200 generated with the open-source pip nvidia-cuda-tileiras 13.3.36.
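As a rough illustration of what dispatching via LIGER_KERNEL_IMPL=cutile could look like, here is a minimal sketch of environment-variable backend selection. Only the variable name and the value `cutile` come from this PR. The registry, the `register` and `select_impl` helpers, the `triton` default, and the fallback behaviour are all assumptions made for illustration, not Liger-Kernel's actual layout.

```python
import os

# Hypothetical registry: backend name -> {op name -> callable}.
# The structure and helper names are illustrative assumptions.
_IMPLS = {"triton": {}, "cutile": {}}


def register(backend, op):
    """Register a callable as the implementation of `op` for `backend`."""
    def decorator(fn):
        _IMPLS[backend][op] = fn
        return fn
    return decorator


@register("triton", "rope")
def _rope_triton(x):
    # Placeholder body standing in for a Triton kernel launch.
    return ("triton", x)


@register("cutile", "rope")
def _rope_cutile(x):
    # Placeholder body standing in for a cuTile kernel launch.
    return ("cutile", x)


def select_impl(op, env=None):
    """Pick the implementation of `op` named by LIGER_KERNEL_IMPL.

    Assumptions: the default backend is "triton", and an op that has no
    implementation in the requested backend falls back to "triton".
    """
    env = os.environ if env is None else env
    backend = env.get("LIGER_KERNEL_IMPL", "triton").lower()
    if backend not in _IMPLS:
        raise ValueError(f"unknown LIGER_KERNEL_IMPL value: {backend!r}")
    return _IMPLS[backend].get(op, _IMPLS["triton"][op])
```

Passing the environment as an argument keeps the selection testable without mutating `os.environ`; in real use the variable would be set in the shell before launching the benchmark or training script.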

Summary

Testing Done

Hardware Type:
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

Notable kernel choices:
- rope/qwen2vl_mrope: stay in input dtype, drop redundant .contiguous() copies
- kl_div: drop scale constexpr to avoid per-iter JIT recompile
- group_norm: fp32 stats for numerical parity
- multi_token_attention: conv-backward runs under the same cuDNN heuristic as the Triton path (no forced cudnn.benchmark) for apples-to-apples comparison

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
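The kl_div note about dropping the scale constexpr can be pictured with a toy model of a specializing JIT. The sketch below is not cuTile's or Triton's API. `ToyJit`, `scaled_sum`, and the cache-key scheme are hypothetical stand-ins that assume a JIT compiles one variant per distinct tuple of compile-time constants. Under that assumption, a scale that changes between calls triggers a compile on every call when it is a constexpr, and only one compile when it is an ordinary runtime argument.

```python
class ToyJit:
    """Toy stand-in for a JIT that specializes on compile-time constants.

    Hypothetical model: one "compilation" per distinct tuple of values
    of the arguments listed in `constexpr_names`.
    """

    def __init__(self, fn, constexpr_names):
        self.fn = fn
        self.constexpr_names = tuple(constexpr_names)
        self.cache = {}
        self.compile_count = 0

    def __call__(self, **kwargs):
        # Only constexpr arguments take part in the specialization key.
        key = tuple(kwargs[name] for name in self.constexpr_names)
        if key not in self.cache:
            self.compile_count += 1  # simulate an expensive compile
            self.cache[key] = self.fn
        return self.cache[key](**kwargs)


def scaled_sum(values, scale):
    """Placeholder for a kernel body that multiplies a reduction by `scale`."""
    return scale * sum(values)


# `scale` baked in as a compile-time constant vs. passed at runtime.
baked = ToyJit(scaled_sum, constexpr_names=["scale"])
runtime = ToyJit(scaled_sum, constexpr_names=[])

for step in range(1, 6):
    scale = 1.0 / step  # a scale that differs on every iteration
    baked(values=[1.0, 2.0], scale=scale)
    runtime(values=[1.0, 2.0], scale=scale)
```

In this toy model `baked.compile_count` grows with the number of distinct scale values, while `runtime.compile_count` stays at one, which is the trade-off the commit message describes.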

xjmxyt force-pushed the jinmanx/add_kernel_v2 branch from 13e5472 to 5fb9475 on June 26, 2026 08:37

xjmxyt wants to merge 1 commit into linkedin:main from xjmxyt:jinmanx/add_kernel_v2
