DeepSpeed
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
#2999
Merged
tjruwase merged 39 commits into deepspeedai:master from YizhouZ:yizhou/fix
* try to fix broadcast error on multi-node training with ZeroStage3 a…
1f5a38ae
YizhouZ requested a review from jeffra 2 years ago
YizhouZ requested a review from tjruwase 2 years ago
YizhouZ requested a review from samyam 2 years ago
YizhouZ requested a review from mrwyattii 2 years ago
Merge branch 'master' into yizhou/fix
e2d27366
Merge branch 'master' into yizhou/fix
aad8c5ef
Merge branch 'master' into yizhou/fix
53d5414c
YizhouZ changed the title from "Try to fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2" to "Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2" 2 years ago
* fix format error
6db8e2e3
tjruwase approved these changes on 2023-03-20
Merge branch 'master' into yizhou/fix
61f09d70
Merge branch 'master' into yizhou/fix
28778c03
Merge branch 'master' into yizhou/fix
9bf11e23
abhilash1910 commented on 2023-03-27
Merge branch 'master' into yizhou/fix
7f4e3fac
Merge branch 'master' into yizhou/fix
24b98634
Merge branch 'master' into yizhou/fix
eb7d8623
Merge branch 'master' into yizhou/fix
d397366e
Merge branch 'master' into yizhou/fix
3987cd29
Merge branch 'master' into yizhou/fix
330363d4
Merge branch 'master' into yizhou/fix
6f230231
Merge branch 'master' into yizhou/fix
50bd160f
Merge branch 'master' into yizhou/fix
2ec06001
* fix format issue
c43d7cd3
Merge branch 'master' into yizhou/fix
972f4723
Merge branch 'master' into yizhou/fix
38470b83
Merge branch 'master' into yizhou/fix
85e713f3
* add TODO for integrated testing of TP and ZeRO 1/2/3
bf10543f
Merge branch 'master' into yizhou/fix
6144678e
tjruwase enabled auto-merge (squash) 2 years ago
Merge branch 'master' into yizhou/fix
7c509369
Merge branch 'master' into yizhou/fix
ad8fc9ba
Merge branch 'master' into yizhou/fix
03972ca7
Merge branch 'master' into yizhou/fix
9e15ab0f
Merge branch 'master' into yizhou/fix
2b5499c9
Merge branch 'master' into yizhou/fix
e0803acd
Merge branch 'master' into yizhou/fix
705cce57
Merge branch 'master' into yizhou/fix
d55f6bbe
Merge branch 'master' into yizhou/fix
62601e8c
Merge branch 'master' into yizhou/fix
144c85ed
fix default pg error
ac9aff0d
Auto-merge disabled 2 years ago: head branch was pushed to by a user without write access
Merge branch 'master' into yizhou/fix
d9ff81af
Merge branch 'master' into yizhou/fix
74fc1cca
Merge branch 'master' into yizhou/fix
9d54e1a8
Merge branch 'master' into yizhou/fix
35d4e693
tjruwase added merge-queue
Merge branch 'master' into yizhou/fix
4f9dc6e9
tjruwase merged 9f4a8763 into master 2 years ago
YizhouZ deleted the yizhou/fix branch 2 years ago
Reviewers: tjruwase, abhilash1910, jeffra, samyam, mrwyattii
Assignees: No one assigned
Labels: None yet
Milestone: No milestone