DeepSpeed
Add Gram Newton-Schulz orthogonalization for Muon optimizer
#7953
Open

delock wants to merge 7 commits into master from gma/gram_muon
delock requested a review from tjruwase 3 days ago
delock requested a review from tohtana 3 days ago
delock requested a review from loadams 3 days ago
chatgpt-codex-connector commented on 2026-04-03
delock force-pushed from 4fe6a0b4 to d17212ef 3 days ago
delock Add Gram Newton-Schulz iteration for Muon optimizer
381d8b7a
delock docs: add ns_method parameter to Muon optimizer documentation
e9beb2de
delock fix: correct Gram Newton-Schulz reference URL
d17212ef
delock Use accelerator API for dtype selection in Newton-Schulz iterations
54930203
delock Fix non-contiguous tensor output from Gram NS for tall matrices
e5de42ce
delock Fold transpose into matmul in Gram NS for tall matrices
a6cf6b69
delock Use fused addmm and eliminate eye allocation in Gram NS
61095611
PKUWZP requested a review from PKUWZP 2 days ago

