DeepSpeed
42a8eaa7 - Auto convert moe param groups (#5354)

Commit
1 year ago
Auto convert moe param groups (#5354)

When frameworks such as HF Accelerate are used with MoE models from HF, DeepSpeed has no way to automatically create MoE-compatible param groups at the point where it creates the optimizer. This PR detects the case where no client optimizer is set and model_parameters are passed to DeepSpeed. It then checks whether the param groups are MoE compatible and, if they are not, makes them MoE compatible automatically.

This was not an issue previously because (1) MoE hasn't really been tested outside MDS and (2) MDS manually converts the weight-decay param groups to be MoE compatible before calling deepspeed.initialize.

The error raised when the param groups are not MoE compatible is triggered here: https://github.com/microsoft/DeepSpeed/blob/cc897ecf15fdac5437fa4a2743154dc6c1749da4/deepspeed/runtime/zero/stage_1_and_2.py#L610-L612

Tagging @tohtana and @ykim362 to help review

---------

Co-authored-by: Jeff Rasley <jeff.rasley@snowflake.com>
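
The conversion described above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code added by this PR: FakeParam, is_expert, has_moe_group and make_moe_compatible are hypothetical names, and the sketch assumes that expert parameters are marked with allreduce=False and a group_name attribute, and that a MoE-compatible param group is one carrying a 'moe': True key.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class FakeParam:
    """Hypothetical stand-in for a model parameter (not a DeepSpeed or torch type)."""
    name: str
    allreduce: bool = True  # assumption: expert params are marked allreduce=False
    group_name: str = ""    # assumption: expert params carry an expert-group name


def is_expert(param: FakeParam) -> bool:
    # Assumed convention: a parameter that is not all-reduced belongs to an expert.
    return not param.allreduce


def has_moe_group(param_groups: List[Dict[str, Any]]) -> bool:
    # Assumed convention: a MoE-compatible group has 'moe' set to True.
    return any(group.get("moe", False) for group in param_groups)


def make_moe_compatible(param_groups: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return param groups in which expert params live in their own 'moe' groups."""
    if has_moe_group(param_groups):
        return param_groups  # already compatible, leave untouched

    converted: List[Dict[str, Any]] = []
    expert_groups: Dict[Tuple[int, str], Dict[str, Any]] = {}
    for group in param_groups:
        # Copy hyperparameters (lr, weight_decay, ...) so expert groups inherit them.
        hyper = {k: v for k, v in group.items() if k != "params"}
        dense = dict(hyper, params=[])
        for param in group["params"]:
            if is_expert(param):
                key = (id(group), param.group_name)
                if key not in expert_groups:
                    expert_groups[key] = dict(hyper, params=[], moe=True,
                                              name=param.group_name)
                expert_groups[key]["params"].append(param)
            else:
                dense["params"].append(param)
        converted.append(dense)

    # Expert groups are appended after the original (now dense-only) groups.
    converted.extend(expert_groups.values())
    return converted
```

Under these assumptions, passing two weight-decay groups where one contains an expert parameter yields three groups: the two original groups without the expert parameter, plus one 'moe' group that inherits the weight decay of the group the expert came from. Calling the function on already compatible groups returns them unchanged.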