[Blog] Muon Optimizer Support in DeepSpeed #7962
delock force pushed from 62714543 to 5da5fad6 62 days ago
delock marked this pull request as ready for review 61 days ago
delock force pushed from d1b1497a to 79325337 61 days ago
delock force pushed from 9aaa30af to 1ff51eb1 42 days ago
delock force pushed from 773f05fc to 52c9c296 42 days ago
PKUWZP commented on 2026-05-03
delock force pushed from 7d945589 to 454eb1a2 35 days ago
delock force pushed from b9f133f8 to d4f93344 29 days ago
Muon optimizer blog draft
8aa8f4f7
add contributor list
b79f2280
fix checkboxes
40d7eb61
expand memory analysis
4a3f2aae
trim down
a0f8653f
remove memory data
315055bc
fix formatting
b3a27209
delock force pushed from d4f93344 to b78d616b 26 days ago
fix gramma
e64b7aa3
Add convergence experiment result and fix typos in Muon blog
feddb764
Add training configuration caption to convergence chart
65c0010e
Update Muon blog with measured convergence and memory data
35f8e764
Update Muon blog future plan: mark ZeRO stage 3 and Gram NS as done
8d034c0f
Add Muon pretraining convergence advantage to What is Muon section
bb164d0c
Revamp future plan into What's Next with active roadmap tone
3dcc5a3a
Add GLM-5 as Muon adopter and fix What's Next roadmap
c1f89b16
Add Muon blog to Latest News in README and docs landing page
201ca711
Refine Muon blog: convergence results, LR tuning guide, and formattin…
d954e232
Update Muon blog: convergence results, citation fixes, and DeepSeek-V4
0085c95a
Add Peng Du (@pengdurice) to Muon blog contributors
36f78dbf
Remove eval loss curve from Muon blog
09018d55
Update Muon blog: final experiment results with tuned learning rate
49be0447
Fix metric count in Muon blog: 3 out of 4
b3c0d129
Fix improvement numbers in Muon blog: use absolute pp difference
84f6a87c
Update release date from April to May
e7275eb1
Reorder Muon blog above SDMA in README
a665878d
Add SDMA entry to docs/index.md
1eaa8009
delock force pushed from 4306f2d6 to 1eaa8009 26 days ago
Sync blog README with Google Doc edits
4f737d94
Merge branch 'master' into gma/muon_blog
f65df29e
delock merged de473091 into master 21 days ago
delock deleted the gma/muon_blog branch 21 days ago