alexm-redhat
changed the title from "Cutlass MLA decode with unrestricted head_dim (can be < 128) which allows TP as well" to "Cutlass MLA decode with unrestricted num_heads (can be < 128) which allows TP as well" 156 days ago
mgoin
changed the title from "Cutlass MLA decode with unrestricted num_heads (can be < 128) which allows TP as well" to "SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP" 152 days ago