DeepSpeed
zero3: SDMA allgather via mori (sdma_allgather)
#7999
Merged
delock merged 37 commits into deepspeedai:master from inkcherry:sdma_ag_
inkcherry requested a review from tjruwase 37 days ago
inkcherry requested a review from tohtana 37 days ago
inkcherry requested a review from GuanhuaWang 37 days ago
chatgpt-codex-connector commented on 2026-05-07
add zero3 example
55b24f3e
enable sdma allgather
fbedb2fa
fix bug
ccb634eb
fix bug
e0eb5103
fix bug
6512ecfa
add test case
d5f8489e
fix bug
4b2d44d1
copy_output_to_user=True
f3a0d1bb
use same training sample
939cc0c3
add flops
ca01795b
add training log
33edc8a0
change to 2.7b
5eb18e87
copy_output_to_user: bool = False
f7d587d7
fix noncopy
4053ea1a
fix bug
72020df8
update
6b782d96
use real txt
fc415527
zero3: route SDMA allgather through mori_cpp.AllGatherIntoTensor
2c5104c8
zero3: drop CPU sync from SDMA Work.wait() to match RCCL semantics
f979a540
zero3: add sdma_allgather end-to-end examples (GPT + Qwen3-32B)
5644ae32
update readme
2f5eaa6a
inkcherry force pushed from 29c7d0e9 to 2f5eaa6a 36 days ago
readme: 2000-step loss curve plots (off vs on)
e7bbe36c
qwen3 trainer: chunked wikitext loader + cleaner 2000-step loss curve
57d929df
readme: drop perf annotation from GPT loss plot (loss-only figures)
cec6dbfb
update readme
8f45f83c
comm: move SDMA allgather into TorchBackend as a transparent fast-path
606f309a
sdma allgather: explicit opt-in env var + leave ZeRO-3 hot path untou…
35e1102c
examples/sdma_allgather/README: fill in GPT peak memory cell
e7402be9
inkcherry force pushed from 2a05f455 to e7402be9 31 days ago
examples/sdma_allgather: drop accidentally-committed baseline_pp.py
0ce6bd2c
delock commented on 2026-05-13
delock commented on 2026-05-13
delock commented on 2026-05-13
delock requested changes on 2026-05-13
delock commented on 2026-05-13
update
e66d6640
update comments
5e5c3fe4
delock approved these changes on 2026-05-13
sdma allgather: fix CI format checks
bedb5eb4
update
cc505f94
update readme
94153745
mori: move from deepspeed/runtime/comm to deepspeed/comm
4111556e
Merge branch 'master' of github.com:deepspeedai/DeepSpeed into sdma_ag_
f0dc1f4c
examples/sdma_allgather: fix CI format checks
24e83866
delock merged 66af8f03 into master 30 days ago
Reviewers: delock, chatgpt-codex-connector, tjruwase, tohtana, GuanhuaWang
Assignees: No one assigned
Labels: None yet
Milestone: No milestone