Fix loading of Qwen3 FP8 (#43494)
* Fix loading of Qwen3 FP8
The Qwen3 MoE config was missing an attribute-map entry for num_local_experts
(Qwen3MoeConfig stores this value under the name num_experts), which made it
impossible to load FP8 quantized models due to the following exception:
```
Traceback (most recent call last):
  File ".../exps/train-qwen3-lora.py", line 4, in <module>
    base_model = AutoModelForCausalLM.from_pretrained('Qwen/Qwen3-30B-A3B-Thinking-2507-FP8')
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../transformers/src/transformers/models/auto/auto_factory.py", line 372, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../transformers/src/transformers/modeling_utils.py", line 4075, in from_pretrained
    hf_quantizer.preprocess_model(
  File ".../transformers/src/transformers/quantizers/base.py", line 167, in preprocess_model
    self._process_model_before_weight_loading(model, **kwargs)
  File ".../transformers/src/transformers/quantizers/quantizer_finegrained_fp8.py", line 106, in _process_model_before_weight_loading
    model = replace_with_fp8_linear(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../transformers/src/transformers/integrations/finegrained_fp8.py", line 617, in replace_with_fp8_linear
    new_module = FP8Expert(
                 ^^^^^^^^^^
  File ".../transformers/src/transformers/integrations/finegrained_fp8.py", line 496, in __init__
    self.num_experts = config.num_local_experts
                       ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../transformers/src/transformers/configuration_utils.py", line 164, in __getattribute__
    return super().__getattribute__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen3MoeConfig' object has no attribute 'num_local_experts'
```
A small reproducer is added in the form of a unit test.
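For context, transformers configs support this kind of aliasing via the
attribute_map class attribute on PretrainedConfig: attribute reads of the
aliased name are redirected to the stored field. A minimal standalone sketch
of that pattern (the class and field names below are illustrative, not the
actual transformers implementation):

```python
class MoeConfig:
    # Maps the name callers expect to the name actually stored on the config.
    attribute_map = {"num_local_experts": "num_experts"}

    def __init__(self, num_experts=128):
        self.num_experts = num_experts

    def __getattribute__(self, key):
        # Redirect reads of an aliased attribute to the stored field.
        # super().__getattribute__ avoids recursing into this method.
        if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
            key = super().__getattribute__("attribute_map")[key]
        return super().__getattribute__(key)


config = MoeConfig(num_experts=128)
assert config.num_local_experts == 128  # alias resolves to num_experts
```

With the mapping entry in place, code such as FP8Expert that reads
config.num_local_experts works unchanged against configs that store the
value as num_experts.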
* Update tests/quantization/finegrained_fp8/test_fp8.py
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
---------
Co-authored-by: nemo <git@ningu.net>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>