vllm
b5b57e30 - [AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (#12134)

Committed 145 days ago
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
  • vllm/model_executor/layers/quantization
    • fp8.py
    • utils
      • fp8_utils.py