vllm
8cdc3712
- SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)
Commit
148 days ago
SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
References
#20769 - SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP
Author
alexm-redhat
Parents
61e20828