vllm
36b60baf - Fix EPLB + NVFP4 modelopt: use direct Parameter assignment for g1/g2_alphas

Commit

54 days ago

Fix EPLB + NVFP4 modelopt: use direct Parameter assignment for g1/g2_alphas

PR #34646 uses replace_parameter() to register g1_alphas and g2_alphas, but these are never pre-registered in create_weights(), so replace_parameter's getattr(mod, name) raises AttributeError.

Switch to direct nn.Parameter assignment (matching the compressed_tensors path in the same PR) so the parameters are properly registered on the layer. This ensures get_expert_weights() includes them in the EPLB rearrangement set, and the quant config holds a reference to the same tensor that EPLB modifies in-place.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
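The failure mode and the fix described above can be reproduced with plain PyTorch. The sketch below is illustrative only: replace_parameter_sketch is a hypothetical, simplified stand-in that copies just the getattr-first behaviour named in the commit message, not vLLM's actual replace_parameter(), and the bare nn.Module stands in for the real MoE layer.

```python
import torch
from torch import nn


def replace_parameter_sketch(mod: nn.Module, name: str, new: torch.Tensor) -> None:
    """Hypothetical stand-in: look up the existing attribute, then swap it.

    Assumption: like the helper described in the commit message, it calls
    getattr(mod, name) first, so the name must already exist on the module.
    """
    _ = getattr(mod, name)  # AttributeError if `name` was never registered
    setattr(mod, name, nn.Parameter(new, requires_grad=False))


layer = nn.Module()  # stand-in for a layer whose create_weights() never made g1_alphas

# 1) Replace-style registration fails because the attribute does not exist yet.
try:
    replace_parameter_sketch(layer, "g1_alphas", torch.ones(4))
    failed = False
except AttributeError:
    failed = True

# 2) Direct assignment goes through nn.Module.__setattr__, which registers
#    the nn.Parameter on the module.
layer.g1_alphas = nn.Parameter(torch.ones(4), requires_grad=False)
registered = "g1_alphas" in dict(layer.named_parameters())

# 3) Another holder of the same Parameter object (here `held`, playing the role
#    of the quant config) observes in-place updates made through the layer.
held = layer.g1_alphas
with torch.no_grad():
    layer.g1_alphas.copy_(torch.arange(4, dtype=torch.float32))
shared = torch.equal(held, torch.arange(4, dtype=torch.float32))
```

Because the assigned object is the registered parameter itself, any code that enumerates the module's parameters can find it, and an in-place copy_ keeps every existing reference pointing at current data.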

Author

tlrmchlsmth

Parents

4f57aa65
