DeepSpeed
Communication Optimization for Large-Scale Training
#4695
Merged

jeffra merged 28 commits into master from comm-opt
RezaYazdaniAminabadi
add allgather opt
0fff3505
add all-reduce optimization for large-scale training with many GPUs
6d0b4c2c
efficient communication layout for training MoE architecture at large…
970015bb
fuse the all_to_all for the seq-parallel into one and use all_to_all_…
caba320a
add readme for the comm-optimization blog
9937e42e
remove images
6f8e4acf
add missing figures
20017b59
fix convergence figure
cbfc6813
reduce fig width
67b8aa1d
adjust fig width
35f6db37
RezaYazdaniAminabadi requested a review from jeffra 2 years ago
RezaYazdaniAminabadi requested a review from tjruwase 2 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 2 years ago
RezaYazdaniAminabadi requested a review from awan-10 2 years ago
some minor edits
c1a5448f
some more editing
11200382
adjust fig width
c76f0c37
adjust fig width
d4042112
tjruwase commented on 2023-11-17
tjruwase commented on 2023-11-17
tjruwase commented on 2023-11-17
tjruwase commented on 2023-11-17
tjruwase commented on 2023-11-17
tjruwase commented on 2023-11-17
address Tunji's comments
22ef6183
change intro to make the sections more connected
f8756a1d
RezaYazdaniAminabadi Merge branch 'master' into comm-opt
3e6cda77
remove the convergence part to be added in a following update
5da60a1a
Merge branch 'comm-opt' of github.com:microsoft/DeepSpeed into comm-opt
f05b03b7
add some minor fixes
919cdab7
RezaYazdaniAminabadi Merge branch 'master' into comm-opt
50e70716
fix a bug
f00e08b8
formatting
f1a83563
RezaYazdaniAminabadi requested a review from cmikeh2 2 years ago
RezaYazdaniAminabadi requested a review from arashb 2 years ago
tjruwase commented on 2023-11-21
tjruwase commented on 2023-11-21
tjruwase commented on 2023-11-21
RezaYazdaniAminabadi Update blogs/comm-opt/README.md
80e5b005
RezaYazdaniAminabadi Update blogs/comm-opt/README.md
3af542b1
RezaYazdaniAminabadi Update blogs/comm-opt/README.md
d0c45d37
fix clang-format
7a88b21e
tjruwase approved these changes on 2023-11-21
RezaYazdaniAminabadi Merge branch 'master' into comm-opt
471e34c3
jeffra approved these changes on 2023-11-21
jeffra merged 2afa1c7f into master 2 years ago
jeffra deleted the comm-opt branch 2 years ago
GeneZC
RezaYazdaniAminabadi
GeneZC
RezaYazdaniAminabadi
CurryRice233
CurryRice233 commented on 2024-01-12

