DeepSpeed
Refactor MoE and Groups API to simplify model creation and management
#1798
Merged

jeffra merged 38 commits into master from amawa/moe-api-updates
awan-10 add docs/proposal.
6f212850
awan-10 save pending code.
9ce86f5a
awan-10 add more changes.
bb008c9a
awan-10 Add changes to groups.
08e8a455
awan-10 use proper world group instead of None.
e431ef37
awan-10 fix the world group by cloning it
d2b7a445
awan-10 fix group creation and return.
ec484adb
awan-10 fix typo
8cc33c08
awan-10 another stupid typo.
966c36be
awan-10 fix failed unit tests.
95796951
awan-10 use the correct method to get num_experts
b222a7b3
awan-10 cleanup the layer code.
39197e88
awan-10 remove the old code
90e0c967
awan-10 Move group creation to engine. Fix several small things. Use helper …
3c9d379f
awan-10 fix
55b897e0
awan-10 Merge branch 'master' into amawa/moe-api-updates
21a08ba5
awan-10 Zheweiyao/moe post release residual moe (#289) (#290)
bd74c0b3
awan-10 feedback from Jeff/ammar review
292a33e2
awan-10 Fix unit tests and add checks to create groups.
e149ea69
awan-10 fix failing moe tests.
b39145a7
awan-10 fix name typo
08b31911
awan-10 pass ep_size down to the layer.
addb7920
awan-10 return rank
d318b032
awan-10 All test pass. Fix pr-moe test
b93f5628
awan-10 add missing import
881cfc47
awan-10 parallelism spec based group creation.
b9f2aefb
awan-10 fix missing underscores.
3a1ea689
awan-10 Fix module search.
5e9067d3
awan-10 remove print.
477a415f
awan-10 add missing _
61a2e5d3
yaozhewei fix typos, fix split parameters for moe optimizer
e5260071
awan-10 update moe/cifar-moe tutorial.
40ce1b5b
awan-10 explain/fix use_residual.
2713e0d3
awan-10 fix unit tests and remove groups from tutorial. mark todo for inferen…
2e89e0fe
get the moe flag from the right place on the transformer layer
1fedae9b
awan-10 Add in more feedback.
695945c3
awan-10 Add in feedback
5323f746
awan-10 Merge branch 'master' into amawa/moe-api-updates
7d5ae4bd
awan-10 requested a review from jeffra 3 years ago
awan-10 requested a review from samyam 3 years ago
awan-10 requested a review from tjruwase 3 years ago
awan-10 requested a review from ShadenSmith 3 years ago
awan-10 requested a review from conglongli 3 years ago
awan-10 requested a review from cli99 3 years ago
awan-10 requested a review from eltonzheng 3 years ago
awan-10 requested a review from minjiaz 3 years ago
awan-10 requested a review from RezaYazdaniAminabadi 3 years ago
jeffra approved these changes on 2022-02-28
jeffra merged c0af6d90 into master 3 years ago
jeffra deleted the amawa/moe-api-updates branch 3 years ago
