Restore gpt_oss rope_scaling reconstruction + minimax_m2 scoring_func mapping; minor GGUFQuantizedTensor perf
Serge review flagged two regressions:
* gpt_oss GGUFs ship rope scaling under `gpt-oss.rope.scaling.*`
metadata keys; HF `GptOssConfig` expects them rebuilt into
`config.rope_scaling`. Restored.
* minimax_m2 GGUFs ship `expert_gating_func` as an int while HF
`MiniMaxM2Config` expects `scoring_func` as a string
(`{0:"none",1:"softmax",2:"sigmoid"}`). Restored.
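
For reviewers, a minimal sketch of the two restored conversions, not the actual loader code. The helper names (`rebuild_rope_scaling`, `map_scoring_func`), the flat-dict metadata shape, and the sample sub-keys are assumptions for illustration; only the `gpt-oss.rope.scaling.` prefix and the int-to-string table come from the notes above.

```python
from typing import Any, Dict, Optional

# Prefix taken from the notes above.
ROPE_PREFIX = "gpt-oss.rope.scaling."

# Int-to-name table quoted from the notes above.
GATING_FUNC_TO_SCORING_FUNC = {0: "none", 1: "softmax", 2: "sigmoid"}


def rebuild_rope_scaling(metadata: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: gather flat `gpt-oss.rope.scaling.*` keys into one dict.

    Sub-key names are passed through unchanged; the real loader may rename them.
    """
    scaling = {
        key[len(ROPE_PREFIX):]: value
        for key, value in metadata.items()
        if key.startswith(ROPE_PREFIX)
    }
    # Return None when the GGUF carries no rope scaling keys at all.
    return scaling or None


def map_scoring_func(expert_gating_func: int) -> str:
    """Hypothetical helper: translate the GGUF int into the string HF expects."""
    try:
        return GATING_FUNC_TO_SCORING_FUNC[expert_gating_func]
    except KeyError:
        raise ValueError(
            f"unknown expert_gating_func: {expert_gating_func!r}"
        ) from None


if __name__ == "__main__":
    # Sample metadata values are made up for the example.
    sample = {
        "gpt-oss.rope.scaling.type": "yarn",
        "gpt-oss.rope.scaling.factor": 32.0,
        "gpt-oss.context_length": 131072,
    }
    print(rebuild_rope_scaling(sample))
    print(map_scoring_func(2))
```

Raising on an unknown int, rather than silently falling back to a default, is a design choice in this sketch so that a new gating function in a future GGUF surfaces as a load error.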
Also tightens `GGUFQuantizedTensor` for a ~10% real-world load speedup:
short-circuits `tensor[...]` (a no-op for already-loaded torch bytes),
defaults to `non_blocking=True` on `.to()`, and narrows the
`__torch_function__` re-wrap to `Tensor.to` only.