vllm
64862d10 - [ROCM][AMD][TRITON] Halving warps number for fw_prefill to reduce spilling (#12713)

Commit
308 days ago
[ROCM][AMD][TRITON] Halving warps number for fw_prefill to reduce spilling (#12713)

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>
Co-authored-by: Aleksandr Malyshev <maleksan@amd.com>