vllm
8cdc3712 - SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)

Commit
148 days ago
SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)

Signed-off-by: Alexander Matveev <amatveev@redhat.com>