vllm
b5b57e30
- [AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (#12134)
Commit
145 days ago
[AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (#12134)

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
References
#12134 - [AMD][FP8] Using MI300 FP8 format on ROCm for block_quant
Author
gshtras
Parents
54cacf00
Files
2
vllm/model_executor/layers/quantization/fp8.py
vllm/model_executor/layers/quantization/utils/fp8_utils.py