transformers
3eaedc8d - Remove unused shared_expert_combination_strategy from cohere2_moe

Commit
29 days ago
Remove unused shared_expert_combination_strategy from cohere2_moe

All published cohere2_moe checkpoints (the command-a-plus-05-2026 family) use the 'average' strategy; the 'sum' branch is dead. Drop the config field and always average the shared-expert output. Existing configs that still carry the key load unchanged (it is ignored).
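
The behavior described above can be illustrated with a minimal sketch. It is not the transformers implementation: `ToyMoeConfig`, `LEGACY_KEYS`, and `combine_shared_expert` are hypothetical names, plain lists stand in for tensors, and "average" is assumed here to mean the element-wise mean of the routed-expert and shared-expert outputs.

```python
from dataclasses import dataclass

# Hypothetical: keys that older config files may still carry but are no longer read.
LEGACY_KEYS = {"shared_expert_combination_strategy"}


@dataclass
class ToyMoeConfig:
    """Toy stand-in for a model config; not the real cohere2_moe config class."""

    hidden_size: int = 4
    num_experts: int = 2

    @classmethod
    def from_dict(cls, raw):
        # Drop legacy keys so that old config files still load without error.
        kept = {k: v for k, v in raw.items() if k not in LEGACY_KEYS}
        return cls(**kept)


def combine_shared_expert(routed, shared):
    # Assumption: 'average' is the element-wise mean of the two outputs.
    # There is no 'sum' branch any more, so no strategy argument is needed.
    return [(r + s) / 2 for r, s in zip(routed, shared)]


# An old config that still carries the removed key loads; the key is ignored.
old_config = {"hidden_size": 8, "shared_expert_combination_strategy": "sum"}
cfg = ToyMoeConfig.from_dict(old_config)
mixed = combine_shared_expert([1.0, 3.0], [3.0, 5.0])
```

In this sketch a stale "sum" value in an old config has no effect on the result, which matches the commit's claim that the key is ignored rather than rejected.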