transformers
ec2f3198 - Restore gpt_oss rope_scaling reconstruction + minimax_m2 scoring_func mapping; minor GGUFQuantizedTensor perf

Commit

26 days ago

Restore gpt_oss rope_scaling reconstruction + minimax_m2 scoring_func mapping; minor GGUFQuantizedTensor perf

Serge review flagged two regressions:

* gpt_oss GGUFs ship rope scaling under `gpt-oss.rope.scaling.*` metadata keys; HF `GptOssConfig` expects them rebuilt into `config.rope_scaling`. Restored.
* minimax_m2 GGUFs ship `expert_gating_func` as an int while HF `MiniMaxM2Config` expects `scoring_func` as a string (`{0:"none",1:"softmax",2:"sigmoid"}`). Restored.

Also tightens `GGUFQuantizedTensor` for ~10% real-world load speedup:

* short-circuit `tensor[...]` (no-op for already-loaded torch bytes),
* default `non_blocking=True` on `.to()`,
* narrow `__torch_function__` re-wrap to only `Tensor.to`.
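For readers unfamiliar with the two restored conversions, here is a minimal standalone sketch of the idea. It is not the code from this commit. The helper names `rebuild_rope_scaling` and `map_scoring_func` are hypothetical, as are the example sub-keys (`type`, `factor`); only the `gpt-oss.rope.scaling.` prefix and the int-to-string table come from the commit message. The real conversion may also rename or filter keys.

```python
from typing import Any, Dict, Optional

# Int -> string table quoted in the commit message.
GATING_FUNC_TO_SCORING_FUNC = {0: "none", 1: "softmax", 2: "sigmoid"}

# Metadata prefix named in the commit message.
ROPE_SCALING_PREFIX = "gpt-oss.rope.scaling."


def rebuild_rope_scaling(metadata: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Collect flat `gpt-oss.rope.scaling.*` keys into one dict.

    Hypothetical helper: it only strips the prefix and does not rename
    sub-keys, which the actual conversion may need to do.
    """
    scaling = {
        key[len(ROPE_SCALING_PREFIX):]: value
        for key, value in metadata.items()
        if key.startswith(ROPE_SCALING_PREFIX)
    }
    # Return None when the GGUF carries no rope scaling metadata at all.
    return scaling or None


def map_scoring_func(expert_gating_func: int) -> str:
    """Translate the integer gating function into the string form."""
    try:
        return GATING_FUNC_TO_SCORING_FUNC[expert_gating_func]
    except KeyError:
        raise ValueError(
            f"unknown expert_gating_func: {expert_gating_func!r}"
        ) from None


if __name__ == "__main__":
    # Illustrative metadata; the sub-key names here are assumptions.
    meta = {
        "gpt-oss.rope.scaling.type": "yarn",
        "gpt-oss.rope.scaling.factor": 32.0,
        "gpt-oss.context_length": 131072,
    }
    print(rebuild_rope_scaling(meta))
    print(map_scoring_func(2))
```

Raising on an unknown integer, rather than silently falling back to a default, is a design choice of this sketch: it makes an unexpected GGUF value fail loudly at load time.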

References

#44794 - Refacto GGUF weight conversion

#45975 - GGUF: optional Metal dequant fast path via kernels-community

Author

ArthurZucker

Committer

ArthurZucker

Parents

b8832746
