vllm
c188749b - [ROCm] Support MLA with nhead<16 and FP8 KV cache for TP=8 (Kimi K2.5/Linear) (#35850)

Commit
5 days ago
[ROCm] Support MLA with nhead<16 and FP8 KV cache for TP=8 (Kimi K2.5/Linear) (#35850)

Signed-off-by: Li <chuali@amd.com>