llama.cpp
079e5a45 - convert : support mixed-precision ModelOpt models with per-tensor NVFP4/FP8 quantization (#20539)

Committed 29 days ago
* support mixed-precision ModelOpt models with per-tensor NVFP4/FP8 quantization
* cleanup
* fallback

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
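The commit message is terse, so as a rough illustration only: in a mixed-precision ModelOpt export, individual weight tensors may be NVFP4-quantized, FP8-quantized, or left unquantized, and a converter must pick a handling path per tensor, with a fallback for tensors that carry no quantization metadata. The sketch below is an assumption-laden mock of that classification step, not llama.cpp's actual code; the `_scale`/`_scale_2` side-tensor naming is a common ModelOpt convention but is assumed here.

```python
# Hypothetical sketch of per-tensor quantization detection for a
# mixed-precision ModelOpt checkpoint. Tensor/suffix names are
# assumptions for illustration, not llama.cpp's implementation.

def detect_quant_format(tensors: dict, name: str) -> str:
    # NVFP4 exports typically carry a two-level scale ("_scale" plus
    # "_scale_2"); FP8 exports carry a single "_scale" tensor.
    if name + "_scale_2" in tensors:
        return "nvfp4"
    if name + "_scale" in tensors:
        return "fp8"
    # Fallback: tensor has no quantization metadata, convert as-is.
    return "none"

def plan_conversion(tensors: dict) -> dict:
    # Skip the scale side-tensors themselves; plan only real weights.
    weights = [
        n for n in tensors
        if not n.endswith("_scale") and not n.endswith("_scale_2")
    ]
    return {n: detect_quant_format(tensors, n) for n in weights}

if __name__ == "__main__":
    ckpt = {
        "blk.0.attn_q.weight": 0,
        "blk.0.attn_q.weight_scale": 0,
        "blk.0.attn_q.weight_scale_2": 0,  # -> nvfp4
        "blk.0.ffn_up.weight": 0,
        "blk.0.ffn_up.weight_scale": 0,    # -> fp8
        "output_norm.weight": 0,           # -> none (fallback)
    }
    print(plan_conversion(ckpt))
```

The per-tensor decision is what makes mixed precision work: one checkpoint can hold NVFP4 attention weights, FP8 FFN weights, and unquantized norms side by side.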