transformers
57e84139 - [Qwen3.5 MoE] Add _tp_plan to ForConditionalGeneration (#45124)

Commit
28 days ago
[Qwen3.5 MoE] Add _tp_plan to ForConditionalGeneration (#45124)

[Qwen3.5 MoE] Add `_tp_plan` to `Qwen3_5MoeForConditionalGeneration`

The VL wrapper class was missing `_tp_plan`, so `lm_head` was not sharded when using `tp_plan="auto"`. The text-only `ForCausalLM` class already defines it; this change aligns the conditional-generation (VL) variant with it.
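The effect described above can be illustrated with a minimal, self-contained sketch. It does not import `transformers`: the `Toy*` classes and the `sharded_modules` helper are hypothetical stand-ins, and the sharding style string `"colwise_rep"` is an assumed value used only for illustration, not taken from this commit.

```python
# Toy stand-ins for the model classes; these are NOT the real transformers classes.

class ToyForCausalLM:
    # Text-only class: already declares a plan entry for lm_head.
    # "colwise_rep" is an assumed style string, used here only as a placeholder.
    _tp_plan = {"lm_head": "colwise_rep"}


class ToyForConditionalGenerationBefore:
    # VL wrapper before the change: no class-level plan is declared.
    _tp_plan = None


class ToyForConditionalGenerationAfter:
    # VL wrapper after the change: mirrors the text-only class.
    _tp_plan = {"lm_head": "colwise_rep"}


def sharded_modules(model_cls):
    """Hypothetical helper: list the module names a plan-driven loader would shard."""
    plan = getattr(model_cls, "_tp_plan", None) or {}
    return sorted(plan)


if __name__ == "__main__":
    for cls in (
        ToyForCausalLM,
        ToyForConditionalGenerationBefore,
        ToyForConditionalGenerationAfter,
    ):
        print(cls.__name__, sharded_modules(cls))
```

In this toy model, a loader that only consults the class-level plan finds nothing to shard on the wrapper until the attribute is added, which is the gap the commit message describes for `lm_head`.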