Override Transformers defaults by GGUF defaults (#42770)
* Override Transformers defaults by GGUF defaults
For some models, GGUF uses default or fixed values that differ from the
defaults in this library. To load GGUF-based models without additional
configuration, we need a compatibility layer.
This commit adds a mapping that supplies GGUF-specific default values
when initializing parameters in this library.
Currently, only the fixed "norm_topk_prob" value of Qwen3 MoE (True) is
defined, because (a) it differs from this library's default (False) and
(b) setting this parameter incorrectly produces almost completely
garbled output.
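A minimal sketch of the idea, not the code in this commit: the names
GGUF_DEFAULTS_BY_MODEL_TYPE and apply_gguf_defaults are hypothetical, the
"qwen3_moe" key is assumed to be the model type string, and the sketch
assumes that values parsed from the GGUF file take precedence over the
GGUF-specific defaults.

```python
# Hypothetical names for illustration; not the identifiers used in this commit.
GGUF_DEFAULTS_BY_MODEL_TYPE = {
    # GGUF fixes this to True for Qwen3 MoE; this library defaults to False.
    "qwen3_moe": {"norm_topk_prob": True},
}


def apply_gguf_defaults(model_type, parsed_config):
    """Return a copy of parsed_config with GGUF-specific defaults filled in."""
    merged = dict(GGUF_DEFAULTS_BY_MODEL_TYPE.get(model_type, {}))
    # Assumption: values explicitly present in the GGUF metadata win.
    merged.update(parsed_config)
    return merged


if __name__ == "__main__":
    print(apply_gguf_defaults("qwen3_moe", {"num_experts": 128}))
```

Model types without an entry in the mapping pass through unchanged, so
only models with known GGUF-specific defaults are affected.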
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
* Apply suggestions from code review
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
---------
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>