vllm
64862d10
- [ROCM][AMD][TRITON] Halving warps number for fw_prefill to reduce spilling (#12713)
Commit
308 days ago
[ROCM][AMD][TRITON] Halving warps number for fw_prefill to reduce spilling (#12713)

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>
Co-authored-by: Aleksandr Malyshev <maleksan@amd.com>
References
#12713 - [ROCM][AMD][TRITON] Halving warps number for fw_prefill to reduce spilling
Author
maleksan85
Parents
b3a0d01e