Add Gram Newton-Schulz orthogonalization for Muon optimizer #7953
delock force-pushed from 4fe6a0b4 to d17212ef 54 days ago
Add Gram Newton-Schulz iteration for Muon optimizer
381d8b7a
docs: add ns_method parameter to Muon optimizer documentation
e9beb2de
fix: correct Gram Newton-Schulz reference URL
d17212ef
Use accelerator API for dtype selection in Newton-Schulz iterations
54930203
Fix non-contiguous tensor output from Gram NS for tall matrices
e5de42ce
Fold transpose into matmul in Gram NS for tall matrices
a6cf6b69
Use fused addmm and eliminate eye allocation in Gram NS
61095611
Merge branch 'master' into gma/gram_muon
dbc9ac9b
Merge branch 'master' into gma/gram_muon
83f18009
PKUWZP approved these changes on 2026-04-30
Merge branch 'master' into gma/gram_muon
a6b0c704
PKUWZP merged 8a77f381 into master 27 days ago
PKUWZP
deleted the gma/gram_muon branch 27 days ago