DeepSpeed
zero3: SDMA allgather via mori (sdma_allgather)
#7999
Merged

delock merged 37 commits into deepspeedai:master from inkcherry:sdma_ag_
inkcherry requested a review from tjruwase 37 days ago
inkcherry requested a review from tohtana 37 days ago
inkcherry requested a review from GuanhuaWang 37 days ago
chatgpt-codex-connector commented on 2026-05-07
add zero3 example
55b24f3e
enable sdma allgather
fbedb2fa
fix bug
ccb634eb
fix bug
e0eb5103
fix bug
6512ecfa
add test case
d5f8489e
fix bug
4b2d44d1
copy_output_to_user=True
f3a0d1bb
use same training sample
939cc0c3
add flops
ca01795b
add training log
33edc8a0
change to 2.7b
5eb18e87
copy_output_to_user: bool = False
f7d587d7
fix noncopy
4053ea1a
fix bug
72020df8
update
6b782d96
use real txt
fc415527
inkcherry zero3: route SDMA allgather through mori_cpp.AllGatherIntoTensor
2c5104c8
inkcherry zero3: drop CPU sync from SDMA Work.wait() to match RCCL semantics
f979a540
inkcherry zero3: add sdma_allgather end-to-end examples (GPT + Qwen3-32B)
5644ae32
inkcherry update readme
2f5eaa6a
inkcherry force pushed from 29c7d0e9 to 2f5eaa6a 36 days ago
inkcherry readme: 2000-step loss curve plots (off vs on)
e7bbe36c
inkcherry qwen3 trainer: chunked wikitext loader + cleaner 2000-step loss curve
57d929df
inkcherry readme: drop perf annotation from GPT loss plot (loss-only figures)
cec6dbfb
inkcherry update readme
8f45f83c
inkcherry comm: move SDMA allgather into TorchBackend as a transparent fast-path
606f309a
inkcherry sdma allgather: explicit opt-in env var + leave ZeRO-3 hot path untou…
35e1102c
inkcherry examples/sdma_allgather/README: fill in GPT peak memory cell
e7402be9
inkcherry force pushed from 2a05f455 to e7402be9 31 days ago
inkcherry examples/sdma_allgather: drop accidentally-committed baseline_pp.py
0ce6bd2c
delock commented on 2026-05-13
delock commented on 2026-05-13
delock commented on 2026-05-13
delock requested changes on 2026-05-13
delock commented on 2026-05-13
inkcherry update
e66d6640
inkcherry update comments
5e5c3fe4
delock approved these changes on 2026-05-13
inkcherry sdma allgather: fix CI format checks
bedb5eb4
inkcherry update
cc505f94
inkcherry update readme
94153745
inkcherry mori: move from deepspeed/runtime/comm to deepspeed/comm
4111556e
inkcherry Merge branch 'master' of github.com:deepspeedai/DeepSpeed into sdma_ag_
f0dc1f4c
inkcherry examples/sdma_allgather: fix CI format checks
24e83866
delock merged 66af8f03 into master 30 days ago