DeepSpeed
df59f203 - allow separate learning rate "muon_lr" and "adam_lr" for muon optimizer (#7658)

Commit
50 days ago
This PR allows separate learning rates for the Muon part and the Adam part of the Muon optimizer, set through "muon_lr" and "adam_lr".

Follow-up to https://github.com/deepspeedai/DeepSpeed/issues/7657

Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
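
To illustrate how the two settings named in the commit might be used, here is a minimal Python sketch. Only the key names "muon_lr" and "adam_lr" come from the commit message. The surrounding "optimizer" / "type" / "params" layout follows the usual DeepSpeed config convention, while the "Muon" type string, the numeric values, the helper `resolve_group_lrs`, and the fallback to a shared "lr" are assumptions made for this example, not behavior confirmed by the commit.

```python
# Sketch only: the key names "muon_lr" and "adam_lr" are taken from the commit
# message; everything else here is an illustrative assumption.

ds_config = {
    "optimizer": {
        "type": "Muon",          # assumed type string
        "params": {
            "lr": 1e-3,          # assumed shared default
            "muon_lr": 2e-2,     # rate for parameters handled by the Muon update
            "adam_lr": 3e-4,     # rate for parameters handled by the Adam update
        },
    }
}


def resolve_group_lrs(params: dict) -> dict:
    """Hypothetical helper: choose one learning rate per optimizer part.

    Assumes each part falls back to the shared "lr" when its own key is
    missing. This fallback is an assumption of the sketch.
    """
    base_lr = params.get("lr")
    muon_lr = params.get("muon_lr", base_lr)
    adam_lr = params.get("adam_lr", base_lr)
    if muon_lr is None or adam_lr is None:
        raise ValueError("set 'lr', or both 'muon_lr' and 'adam_lr'")
    return {"muon": muon_lr, "adam": adam_lr}


if __name__ == "__main__":
    print(resolve_group_lrs(ds_config["optimizer"]["params"]))
```

Keeping the two rates separate is useful because the two update rules can call for different step sizes; check the DeepSpeed documentation for the exact config schema before relying on this layout.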