vllm
32aa74c0
- [ROCm][FP8][Kernel] FP8 quantization fused into Custom Paged Attention (#17139)
Committed 349 days ago
[ROCm][FP8][Kernel] FP8 quantization fused into Custom Paged Attention (#17139)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
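The commit title describes fusing FP8 quantization of the attention output into the ROCm custom paged attention kernel, so the higher-precision output no longer needs a separate quantization pass that re-reads it from memory. The sketch below is a minimal, illustrative PyTorch rendition of the scaled FP8 quantization being fused; the dtype choice, the per-tensor scale, and the helper name `quantize_fp8` are assumptions for illustration, not the actual kernel interface. Assumes a recent PyTorch build that exposes `torch.float8_e4m3fnuz`.

```python
# Illustrative only: per-tensor scaled FP8 quantization of an attention output,
# i.e. the operation this commit fuses into the ROCm paged attention kernel.
# Format and scale handling here are assumptions, not the kernel's exact scheme.
import torch

FP8_DTYPE = torch.float8_e4m3fnuz          # FP8 variant commonly used on ROCm
FP8_MAX = torch.finfo(FP8_DTYPE).max

def quantize_fp8(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Scale, clamp to the FP8 dynamic range, and cast to FP8."""
    y = (x / scale).clamp(min=-FP8_MAX, max=FP8_MAX)
    return y.to(FP8_DTYPE)

if __name__ == "__main__":
    # Unfused pipeline: attention writes an fp16 tensor, quantization re-reads it.
    # In the fused kernel, each output element is scaled and cast before the store,
    # eliminating that extra memory round trip.
    out = torch.randn(4, 8, dtype=torch.float16)   # stand-in for attention output
    scale = out.abs().max() / FP8_MAX              # simple per-tensor scale (assumption)
    print(quantize_fp8(out, scale).dtype)          # torch.float8_e4m3fnuz
```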
References
#17139 - [ROCm][FP8][Kernel] FP8 quantization fused into Custom Paged Attention
Author
gshtras
Parents
7377dd03