DeepSpeed
Tensor parallelism for Mixture of Experts
#2074
Merged

siddharth9820 merged 55 commits into master from moe-tensor-parallelism
siddharth9820 add tensor parallelism support for non-expert groups
347d450c
siddharth9820 non-expert tensor parallelism - drop tokens before a2a
43309364
siddharth9820 support tensor parallelism for non-experts
2643c18e
siddharth9820 fix formatting
f96e0a03
siddharth9820 requested a review from jeffra 3 years ago
siddharth9820 requested a review from samyam 3 years ago
siddharth9820 requested a review from tjruwase 3 years ago
siddharth9820 requested a review from ShadenSmith 3 years ago
siddharth9820 requested a review from conglongli 3 years ago
siddharth9820 requested a review from awan-10 3 years ago
siddharth9820 requested a review from cli99 3 years ago
siddharth9820 requested a review from eltonzheng 3 years ago
siddharth9820 requested a review from minjiaz 3 years ago
siddharth9820 requested a review from RezaYazdaniAminabadi 3 years ago
siddharth9820 requested a review from duli2012 3 years ago
siddharth9820 requested a review from mrwyattii 3 years ago
siddharth9820 requested a review from yaozhewei 3 years ago
siddharth9820 requested a review from arashb 3 years ago
siddharth9820 requested a review from xiaoxiawu-microsoft 3 years ago
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
2dfd09c9
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
7af3e870
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
1c4e8a72
siddharth9820 requested a review from samadejacobs 3 years ago
siddharth9820 migrate code for dropping tokens from megatron
763fb191
siddharth9820 change gather function name
0a797fee
siddharth9820 fall back to previous error message
32063d99
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
b3e2fd84
awan-10 commented on 2022-07-15
awan-10 commented on 2022-07-15
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
1d2d975f
siddharth9820 formatting changes
ef4feb18
siddharth9820 change function names
ed731d03
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
7712d906
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
eb6dd0a8
siddharth9820 changed the title from "Tensor parallelism for Non-Experts" to "Tensor parallelism for Mixture of Experts" 3 years ago
siddharth9820 commented on 2022-07-20
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
9b4cd1fd
tjruwase Merge branch 'master' into moe-tensor-parallelism
fda3714c
siddharth9820 commented on 2022-07-20
siddharth9820 commented on 2022-07-20
siddharth9820 commented on 2022-07-20
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
7962586e
conglongli commented on 2022-07-20
siddharth9820 commented on 2022-07-20
conglongli commented on 2022-07-20
siddharth9820 commented on 2022-07-20
siddharth9820 fix number of local experts
8006a1d7
siddharth9820 Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpe…
43781123
siddharth9820 fix documentation
c9fa9978
siddharth9820 correct log statement
474c9327
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
4fe355dd
siddharth9820 roll back ep-size setting code
555ad4fe
siddharth9820 Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpe…
1299a50f
siddharth9820 add detailed comments
ad0a147b
siddharth9820 restore function in groupy.py
e29ddf6a
siddharth9820 better comments
f8469b7d
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
f4f217cb
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
baa98122
siddharth9820 remove code that changes ep_size and convert it to asserts
9bdeb414
siddharth9820 correct groups
d34f69c2
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
01c61b8c
siddharth9820 correction
dd999c03
siddharth9820 add copyright
382da2ef
siddharth9820 correction
d00c7422
siddharth9820 requested a review from awan-10 3 years ago
siddharth9820 requested a review from conglongli 3 years ago
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
6a561802
conglongli approved these changes on 2022-07-26
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
da5a6884
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
0f3f398f
awan-10 approved these changes on 2022-07-26
awan-10 Merge branch 'master' into moe-tensor-parallelism
725c66be
jeffra commented on 2022-07-26
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
ae0030d3
siddharth9820 formatting changes
3d6a1367
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
870dfd06
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
c5acd1cf
tjruwase Merge branch 'master' into moe-tensor-parallelism
a1c470e3
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
43216ca3
tjruwase Merge branch 'master' into moe-tensor-parallelism
b6dd6ea3
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
372c6634
siddharth9820 add unit tests
5379a21c
siddharth9820 small change
50ae30b8
siddharth9820 Merge branch 'master' into moe-tensor-parallelism
5f040c82
jeffra commented on 2022-07-29
jeffra approved these changes on 2022-07-29
siddharth9820 remove amp from tests
8dfe33dc
siddharth9820 Merge branch 'moe-tensor-parallelism' of github.com:microsoft/DeepSpe…
f918175c
siddharth9820 enabled auto-merge (squash) 3 years ago
disabled auto-merge 3 years ago
Manually disabled by user
tjruwase Merge branch 'master' into moe-tensor-parallelism
0afa114b
siddharth9820 enabled auto-merge (squash) 3 years ago
disabled auto-merge 3 years ago
Manually disabled by user
siddharth9820 enabled auto-merge (squash) 3 years ago
siddharth9820 merged 5fe9d610 into master 3 years ago
siddharth9820 deleted the moe-tensor-parallelism branch 3 years ago