vllm
SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP
#20769
Merged


alexm-redhat merged 2 commits into main from mla_fi_prefill_and_decode
alexm-redhat
alexm-redhat requested a review from WoosukKwon 156 days ago
alexm-redhat requested a review from robertgshaw2-redhat 156 days ago
alexm-redhat requested a review from njhill 156 days ago
alexm-redhat requested a review from ywang96 156 days ago
alexm-redhat requested a review from comaniac 156 days ago
alexm-redhat requested a review from tlrmchlsmth 156 days ago
alexm-redhat requested a review from LucasWilkinson 156 days ago
github-actions
alexm-redhat requested a review from mgoin 156 days ago
mergify added documentation
mergify added ci/build
mergify added v1
mergify
mergify added needs-rebase
gemini-code-assist
gemini-code-assist commented on 2025-07-10
gemini-code-assist
gemini-code-assist commented on 2025-07-10
alexm-redhat force pushed 156 days ago
mergify removed needs-rebase
alexm-redhat force pushed 156 days ago
alexm-redhat force pushed 156 days ago
mgoin
mgoin commented on 2025-07-10
alexm-redhat force pushed 156 days ago
alexm-redhat force pushed 156 days ago
alexm-redhat force pushed 156 days ago
alexm-redhat force pushed to 1bcb4efc 156 days ago
alexm-redhat changed the title from "Cutlass MLA decode with unrestricted head_dim (can be < 128) which allows TP as well" to "Cutlass MLA decode with unrestricted num_heads (can be < 128) which allows TP as well" 156 days ago
mergify
mergify added needs-rebase
alexm-redhat force pushed from 1bcb4efc to 32e44816 155 days ago
mergify removed needs-rebase
alexm-redhat force pushed from 32e44816 155 days ago
alexm-redhat force pushed 155 days ago
alexm-redhat assigned alexm-redhat 155 days ago
alexm-redhat force pushed 155 days ago
alexm-redhat force pushed to c071eb5b 155 days ago
alexm-redhat force pushed from c071eb5b 152 days ago
alexm-redhat force pushed 152 days ago
mgoin
mgoin approved these changes on 2025-07-14
alexm-redhat [Attention] MLA - cutlass decode with unresticted num_heads
44da059e
alexm-redhat force pushed to 44da059e 152 days ago
mgoin changed the title from "Cutlass MLA decode with unrestricted num_heads (can be < 128) which allows TP as well" to "SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP" 152 days ago
mgoin added deepseek
mgoin added performance
mgoin added ready
alexm-redhat enabled auto-merge (squash) 152 days ago
mgoin Merge branch 'main' into mla_fi_prefill_and_decode
3ad13158
alexm-redhat merged 8cdc3712 into main 152 days ago
alexm-redhat deleted the mla_fi_prefill_and_decode branch 152 days ago
LucasWilkinson
LucasWilkinson commented on 2025-07-15
LucasWilkinson
LucasWilkinson commented on 2025-07-15
LucasWilkinson
LucasWilkinson commented on 2025-07-15
jeejeelee
zou3519
mgoin
LucasWilkinson

