DeepSpeed
No Muon optimizer for embedding and lm_head layers
#7641
Merged

sfc-gh-truwase merged 1 commit into master from gma/auto_muon
delock requested a review from loadams 81 days ago
delock requested a review from tjruwase 81 days ago
delock force pushed from 6d331b3e to a59a4528 81 days ago
delock filter out embed layer and lm_head layer from Muon optimizer
a59a4528
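The filtering this commit describes can be illustrated with a minimal sketch. It is not the code from the PR: the function name `split_params_for_muon`, the name fragments `"embed"` and `"lm_head"`, and the rule that only 2-D weights go to Muon are all assumptions made for illustration. It works on plain `(name, shape)` pairs so it runs without any framework installed.

```python
from typing import Iterable, List, Tuple

# Hypothetical name fragments; the commit message only says
# "embed layer and lm_head layer", so the exact match rule is an assumption.
EXCLUDED_NAME_PARTS = ("embed", "lm_head")


def split_params_for_muon(
    named_shapes: Iterable[Tuple[str, Tuple[int, ...]]],
) -> Tuple[List[str], List[str]]:
    """Return (muon_names, fallback_names) for the given (name, shape) pairs."""
    muon_names: List[str] = []
    fallback_names: List[str] = []
    for name, shape in named_shapes:
        # Assumption: only 2-D weight matrices are candidates for Muon.
        is_matrix = len(shape) == 2
        # Embedding and lm_head weights are excluded even though they are 2-D.
        is_excluded = any(part in name for part in EXCLUDED_NAME_PARTS)
        if is_matrix and not is_excluded:
            muon_names.append(name)
        else:
            fallback_names.append(name)
    return muon_names, fallback_names


# Illustrative parameter names and shapes, not taken from the PR.
example = [
    ("model.embed_tokens.weight", (32000, 4096)),
    ("model.layers.0.self_attn.q_proj.weight", (4096, 4096)),
    ("model.layers.0.input_layernorm.weight", (4096,)),
    ("lm_head.weight", (32000, 4096)),
]
muon, fallback = split_params_for_muon(example)
```

In this sketch only the attention projection weight ends up in the Muon group; the embedding, layer norm, and lm_head weights fall through to whatever fallback optimizer the training setup uses.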
sfc-gh-truwase approved these changes on 2025-10-22
sfc-gh-truwase merged 67b365af into master 81 days ago
sfc-gh-truwase deleted the gma/auto_muon branch 81 days ago
