DeepSpeed
Improving memory utilization of Z2+MoE
#2079
Merged


siddharth9820 merged 14 commits into master from zero2_optim_tiling
siddharth9820 add maximum param group size to moe
47af7289
siddharth9820 process optimizer groups individually
8d31edb0
siddharth9820 correct timer placement
e6702d88
siddharth9820 tested with 1.3B 2.7B and 6.7B
7dd3da28
siddharth9820 requested a review from jeffra 3 years ago
siddharth9820 requested a review from samyam 3 years ago
siddharth9820 requested a review from tjruwase 3 years ago
siddharth9820 requested a review from ShadenSmith 3 years ago
siddharth9820 requested a review from conglongli 3 years ago
siddharth9820 requested a review from awan-10 3 years ago
siddharth9820 requested a review from cli99 3 years ago
siddharth9820 requested a review from eltonzheng 3 years ago
siddharth9820 requested a review from minjiaz 3 years ago
siddharth9820 requested a review from RezaYazdaniAminabadi 3 years ago
siddharth9820 requested a review from duli2012 3 years ago
siddharth9820 requested a review from mrwyattii 3 years ago
siddharth9820 requested a review from yaozhewei 3 years ago
siddharth9820 requested a review from arashb 3 years ago
siddharth9820 requested a review from xiaoxiawu-microsoft 3 years ago
siddharth9820 Merge branch 'master' into zero2_optim_tiling
be0d2fe4
siddharth9820 correction in DeepSpeedCPUAdam
b02a34f4
siddharth9820 torch 1.8.0 backwards compatibility
57f64c16
siddharth9820 Merge branch 'master' into zero2_optim_tiling
563d3eb0
siddharth9820 requested a review from samadejacobs 3 years ago
siddharth9820 modify optimizer groups dynamically
1d9852cd
siddharth9820 correction for DSCpuAdam
af4da67f
siddharth9820 commented on 2022-07-12
tjruwase approved these changes on 2022-07-12
tjruwase Merge branch 'master' into zero2_optim_tiling
2ff9812b
siddharth9820 restored comments
41a7d161
siddharth9820 Merge branch 'zero2_optim_tiling' of github.com:microsoft/DeepSpeed i…
e9a1cd33
siddharth9820 remove print statements
3209c3b3
siddharth9820 merged c1af73f7 into master 3 years ago
siddharth9820 deleted the zero2_optim_tiling branch 3 years ago
