vllm
cef32104 - [FP8] Extend per-token-group quantization support to QuantFP8 (#24342)

Commit
111 days ago
[FP8] Extend per-token-group quantization support to QuantFP8 (#24342)

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>