DeepSpeed
2585881a - Make Muon optimizer easier to enable (#7555)

Make Muon optimizer easier to enable (#7555)

The original Muon optimizer PR (https://github.com/deepspeedai/DeepSpeed/pull/7509) requires users to explicitly set the `use_muon` flag on `model.parameters()`, as shown in the test https://github.com/deepspeedai/DeepSpeed/blob/master/tests/unit/ops/muon/test_muon.py#L27. This PR integrates the setting of `use_muon` into DeepSpeed before engine initialization, which makes the Muon optimizer easier to use: users only need to change the optimizer in `config.json` from `AdamW` to `Muon`, with no code changes required. It resolves https://github.com/deepspeedai/DeepSpeed/issues/7552.

---------

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
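As a sketch of the usage the commit message describes, a DeepSpeed `config.json` selecting Muon might look like the following. The field values here are illustrative assumptions, and Muon's exact `params` keys are not specified in this commit message, so they may differ from what the released code accepts:

```json
{
  "train_batch_size": 8,
  "optimizer": {
    "type": "Muon",
    "params": {
      "lr": 0.001
    }
  }
}
```

With this change, switching `"type"` between `AdamW` and `Muon` in the config is the only edit needed; no `use_muon` flags have to be set on `model.parameters()` in user code.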