llama.cpp
f8d4abae - convert : support Qwen3.5/Qwen3.5 Moe NVFP4 and add input scales (#20505)

Commit
13 days ago
convert : support Qwen3.5/Qwen3.5 Moe NVFP4 and add input scales (#20505)

* convert : fix Qwen3.5 NVFP4 conversion
* Updated copilot concerns and rebased
* move into _LinearAttentionVReorderBase and simplify
* --flake
* new_name not needed
* Added input_scale to gguf
* Fixed input_scale addition as tensor
* Added input scale to loader and named _in_s
* Update convert_hf_to_gguf.py

Re-removed input_scale from aux cleanup

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>