allow separate learning rates "muon_lr" and "adam_lr" for muon optimizer (#7658)
This PR allows separate learning rates for the Muon and Adam parts of the Muon
optimizer. Follow-up to
https://github.com/deepspeedai/DeepSpeed/issues/7657
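A minimal configuration sketch of how the two learning rates might be set; the exact optimizer type string and the example values shown are illustrative assumptions, not taken from this PR:

```json
{
  "optimizer": {
    "type": "Muon",
    "params": {
      "muon_lr": 0.02,
      "adam_lr": 0.001
    }
  }
}
```

With this split, matrix parameters handled by the Muon update can use "muon_lr" while the remaining parameters optimized with Adam use "adam_lr".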
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>