DeepSpeed
Tensor parallelism for Mixture of Experts
#2074
Merged
siddharth9820 merged 55 commits into master from moe-tensor-parallelism
add tensor parallelism support for non-expert groups (347d450c)
non-expert tensor parallelism - drop tokens before a2a (43309364)
support tensor parallelism for non-experts (2643c18e)
fix formatting (f96e0a03)
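The "drop tokens before a2a" commit (43309364) reflects that, with tensor parallelism enabled, the input to the MoE layer is replicated across the tensor-parallel ranks, so pushing every replica through the expert all-to-all would waste communication. A minimal sketch of the idea, assuming a generic torch.distributed setup; the helper names drop_tokens/gather_tokens and the slicing convention are illustrative, not necessarily the ones used in this PR:

```python
import torch
import torch.distributed as dist


def drop_tokens(x, tp_group, dim=0):
    """Keep only this tensor-parallel rank's 1/tp_size slice of the replicated tokens."""
    tp_size = dist.get_world_size(group=tp_group)
    if tp_size == 1:
        return x
    rank = dist.get_rank(group=tp_group)
    assert x.size(dim) % tp_size == 0, "token count must be divisible by the TP degree"
    shard = x.size(dim) // tp_size
    # Each TP rank forwards a disjoint shard through the expert all-to-all.
    return x.narrow(dim, rank * shard, shard).contiguous()


def gather_tokens(x, tp_group, dim=0):
    """Restore the full token dimension after the expert computation."""
    tp_size = dist.get_world_size(group=tp_group)
    if tp_size == 1:
        return x
    chunks = [torch.empty_like(x) for _ in range(tp_size)]
    dist.all_gather(chunks, x.contiguous(), group=tp_group)
    return torch.cat(chunks, dim=dim)
```

The later commits "migrate code for dropping tokens from megatron" and "change gather function name" suggest the same drop/gather pattern, adapted from Megatron-LM's utilities.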
siddharth9820 requested a review from jeffra 3 years ago
siddharth9820 requested a review from samyam 3 years ago
siddharth9820 requested a review from tjruwase 3 years ago
siddharth9820 requested a review from ShadenSmith 3 years ago
siddharth9820 requested a review from conglongli 3 years ago
siddharth9820 requested a review from awan-10 3 years ago
siddharth9820 requested a review from cli99 3 years ago
siddharth9820 requested a review from eltonzheng 3 years ago
siddharth9820 requested a review from minjiaz 3 years ago
siddharth9820 requested a review from RezaYazdaniAminabadi 3 years ago
siddharth9820 requested a review from duli2012 3 years ago
siddharth9820 requested a review from mrwyattii 3 years ago
siddharth9820 requested a review from yaozhewei 3 years ago
siddharth9820 requested a review from arashb 3 years ago
siddharth9820 requested a review from xiaoxiawu-microsoft 3 years ago
Merge branch 'master' into moe-tensor-parallelism (2dfd09c9)
Merge branch 'master' into moe-tensor-parallelism (7af3e870)
Merge branch 'master' into moe-tensor-parallelism (1c4e8a72)
siddharth9820 requested a review from samadejacobs 3 years ago
migrate code for dropping tokens from megatron (763fb191)
change gather function name (0a797fee)
fall back to previous error message (32063d99)
Merge branch 'master' into moe-tensor-parallelism (b3e2fd84)
awan-10 commented on 2022-07-15
awan-10 commented on 2022-07-15
Merge branch 'master' into moe-tensor-parallelism (1d2d975f)
formatting changes (ef4feb18)
change function names (ed731d03)
Merge branch 'master' into moe-tensor-parallelism (7712d906)
Merge branch 'master' into moe-tensor-parallelism (eb6dd0a8)
siddharth9820 changed the title from Tensor parallelism for Non-Experts to Tensor parallelism for Mixture of Experts 3 years ago
siddharth9820 commented on 2022-07-20
Merge branch 'master' into moe-tensor-parallelism (9b4cd1fd)
Merge branch 'master' into moe-tensor-parallelism (fda3714c)
siddharth9820 commented on 2022-07-20
siddharth9820 commented on 2022-07-20
siddharth9820 commented on 2022-07-20
Merge branch 'master' into moe-tensor-parallelism (7962586e)
conglongli commented on 2022-07-20
siddharth9820 commented on 2022-07-20
conglongli commented on 2022-07-20
siddharth9820 commented on 2022-07-20
fix number of local experts (8006a1d7)
Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpe… (43781123)
fix documentation (c9fa9978)
correct log statement (474c9327)
Merge branch 'master' into moe-tensor-parallelism (4fe355dd)
roll back ep-size setting code (555ad4fe)
Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpe… (1299a50f)
add detailed comments (ad0a147b)
restore function in groupy.py (e29ddf6a)
better comments (f8469b7d)
Merge branch 'master' into moe-tensor-parallelism (f4f217cb)
Merge branch 'master' into moe-tensor-parallelism (baa98122)
remove code that changes ep_size and convert it to asserts (9bdeb414)
correct groups (d34f69c2)
Merge branch 'master' into moe-tensor-parallelism (01c61b8c)
correction (dd999c03)
add copyright (382da2ef)
correction (d00c7422)
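Two of the commits above, "fix number of local experts" (8006a1d7) and "remove code that changes ep_size and convert it to asserts" (9bdeb414), concern how the expert-parallel degree interacts with the other parallel dimensions. A rough sketch of the kind of consistency checks involved; the function and argument names below are illustrative, not DeepSpeed's actual API:

```python
def check_moe_parallel_config(world_size, tp_size, ep_size, num_experts):
    """Fail loudly on an inconsistent MoE parallel configuration instead of
    silently adjusting ep_size (illustrative check, not DeepSpeed's API)."""
    assert world_size % tp_size == 0, \
        "world size must be divisible by the tensor-parallel degree"
    dp_world_size = world_size // tp_size
    assert dp_world_size % ep_size == 0, \
        "expert-parallel degree must divide the data-parallel world size"
    assert num_experts % ep_size == 0, \
        "number of experts must be divisible by the expert-parallel degree"
    # Each expert-parallel rank then hosts this many experts locally.
    return num_experts // ep_size
```

For example, with world_size=16, tp_size=2, ep_size=4 and num_experts=8, the data-parallel world size is 8 and each expert-parallel rank hosts 2 local experts.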
siddharth9820 requested a review from awan-10 3 years ago
siddharth9820 requested a review from conglongli 3 years ago
Merge branch 'master' into moe-tensor-parallelism (6a561802)
conglongli approved these changes on 2022-07-26
Merge branch 'master' into moe-tensor-parallelism (da5a6884)
Merge branch 'master' into moe-tensor-parallelism (0f3f398f)
awan-10 approved these changes on 2022-07-26
Merge branch 'master' into moe-tensor-parallelism (725c66be)
jeffra commented on 2022-07-26
Merge branch 'master' into moe-tensor-parallelism (ae0030d3)
formatting changes (3d6a1367)
Merge branch 'master' into moe-tensor-parallelism (870dfd06)
Merge branch 'master' into moe-tensor-parallelism (c5acd1cf)
Merge branch 'master' into moe-tensor-parallelism (a1c470e3)
Merge branch 'master' into moe-tensor-parallelism (43216ca3)
Merge branch 'master' into moe-tensor-parallelism (b6dd6ea3)
Merge branch 'master' into moe-tensor-parallelism (372c6634)
add unit tests (5379a21c)
small change (50ae30b8)
Merge branch 'master' into moe-tensor-parallelism (5f040c82)
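The "add unit tests" commit (5379a21c) adds coverage for the new tensor-parallel MoE path. As a sketch only (the real tests live in DeepSpeed's test suite and use its distributed test harness), one property such a test can check, emulated here on a single process, is that dropping each rank's token shard and then gathering reproduces the original replicated input:

```python
import torch


def test_drop_then_gather_roundtrip():
    # Emulate a tensor-parallel group of size 2 on one process: each "rank"
    # keeps its slice of the token dimension, and concatenating the slices
    # must reproduce the replicated input exactly (illustrative only).
    tp_size = 2
    x = torch.randn(8, 4, 16)  # (tokens, batch, hidden)
    shard_len = x.size(0) // tp_size
    shards = [x.narrow(0, rank * shard_len, shard_len) for rank in range(tp_size)]
    gathered = torch.cat(shards, dim=0)
    assert torch.equal(gathered, x)
```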
jeffra commented on 2022-07-29
jeffra approved these changes on 2022-07-29
remove amp from tests (8dfe33dc)
Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpe… (f918175c)
siddharth9820 enabled auto-merge (squash) 3 years ago
auto-merge disabled 3 years ago (manually disabled by user)
Merge branch 'master' into moe-tensor-parallelism (0afa114b)
siddharth9820 enabled auto-merge (squash) 3 years ago
auto-merge disabled 3 years ago (manually disabled by user)
siddharth9820 enabled auto-merge (squash) 3 years ago
siddharth9820 merged commit 5fe9d610 into master 3 years ago
siddharth9820 deleted the moe-tensor-parallelism branch 3 years ago
Reviewers
jeffra
awan-10
conglongli
samyam
tjruwase
ShadenSmith
cli99
eltonzheng
minjiaz
RezaYazdaniAminabadi
duli2012
mrwyattii
yaozhewei
arashb
xiaoxiawu-microsoft
samadejacobs