DeepSpeed
add moe+zero ckpt unit test.
#1429
Merged

awan-10 merged 9 commits into master from amawa/moe-zero-ckpt-test
awan-10 add moe+zero ckpt unit test. it is failing as expected.
69f127d0
awan-10 requested a review from cli99 4 years ago
awan-10 requested a review from conglongli 4 years ago
awan-10 requested a review from eltonzheng 4 years ago
awan-10 requested a review from jeffra 4 years ago
awan-10 requested a review from minjiaz 4 years ago
awan-10 requested a review from niumanar 4 years ago
awan-10 requested a review from RezaYazdaniAminabadi 4 years ago
awan-10 requested a review from samyam 4 years ago
awan-10 requested a review from ShadenSmith 4 years ago
awan-10 requested a review from tjruwase 4 years ago
awan-10 modify the test.
f976921c
awan-10 Define helper outside. Fix the Expert-parallel only case for zero+moe.
0645e685
awan-10 Merge branch 'master' into amawa/moe-zero-ckpt-test
ea03da36
awan-10 fix format.
81d816c4
awan-10 Get ep-ranks list. Works for Expert-only but not for E+D.
16598acc
awan-10 fix format.
63d442e9
awan-10 Fix both cases. Cleanup.
8d2af5fa
awan-10 Merge branch 'master' into amawa/moe-zero-ckpt-test
8df06b2d
awan-10 enabled auto-merge (squash) 4 years ago
tjruwase approved these changes on 2021-10-22
awan-10 merged 0b77d1d9 into master 4 years ago
mrwyattii deleted the amawa/moe-zero-ckpt-test branch 2 years ago
