DeepSpeed
ZeRO-1 tune max-elems + bug fix
#532
Merged


jeffra merged 15 commits into master from jeffra/zero-1-fix-test
jeffra zero-1 memory fix
8fc88cce
jeffra clean-up and added previously missing reduction options
f39335c8
jeffra init self.local_sub_partitions
45b7afae
jeffra code to test
5edf3ed1
jeffra cleanup
bc8a2e0c
jeffra missing fix
c990b871
jeffra fix for empty partitions
cc59e10d
jeffra pick best max elems per comm to reduce extra padding
c35f2371
jeffra fix testing backing to work with torch1.7, fix div by 0
519ea19f
jeffra formatting and less aggressive assert
5a401871
jeffra map max elems per comm into checkpoints
c91dae82
jeffra formatting
94c0c588
jeffra requested a review from arashashari 5 years ago
jeffra requested a review from awan-10 5 years ago
jeffra requested a review from cli99 5 years ago
jeffra requested a review from conglongli 5 years ago
jeffra requested a review from eltonzheng 5 years ago
jeffra requested a review from minjiaz 5 years ago
jeffra requested a review from niumanar 5 years ago
jeffra requested a review from RezaYazdaniAminabadi 5 years ago
jeffra requested a review from samyam 5 years ago
jeffra requested a review from ShadenSmith 5 years ago
jeffra requested a review from tjruwase 5 years ago
tjruwase approved these changes on 2020-11-18
jeffra force pushed from f5cc4cdd to c91dae82 5 years ago
tjruwase approved these changes on 2020-11-18
jeffra add dp size to padding to ensure we have enough to split across ranks
1702415d
jeffra fix padding logic to align with world size
19fa14fa
jeffra Merge branch 'master' into jeffra/zero-1-fix-test
bebc5eb7
tjruwase approved these changes on 2020-11-19
jeffra merged 08c96a1b into master 5 years ago
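
Several commits in this PR concern padding: picking a max-elements-per-communication value that reduces extra padding, aligning the padding with the world size, and guarding against a division by zero. The sketch below illustrates that general idea only; it is not DeepSpeed's implementation. The function names (`align_up`, `padding_needed`, `pick_elems_per_comm`) and the approach of choosing from a list of candidate sizes are assumptions made for illustration.

```python
def align_up(n: int, multiple: int) -> int:
    """Round n up to the nearest multiple of `multiple` (hypothetical helper)."""
    if multiple <= 0:
        # Guard against division by zero, e.g. an unset world size.
        raise ValueError("multiple must be positive")
    return ((n + multiple - 1) // multiple) * multiple


def padding_needed(num_elems: int, elems_per_comm: int) -> int:
    """Padding required to fill a whole number of communication intervals."""
    if elems_per_comm <= 0:
        raise ValueError("elems_per_comm must be positive")
    intervals = -(-num_elems // elems_per_comm)  # ceiling division
    return intervals * elems_per_comm - num_elems


def pick_elems_per_comm(num_elems: int, candidates: list[int], world_size: int) -> int:
    """Pick the candidate interval size (aligned to world_size) with the least padding.

    Aligning each candidate to world_size means every interval can be split
    evenly across data-parallel ranks. Ties go to the larger interval.
    """
    aligned = [align_up(c, world_size) for c in candidates]
    return min(aligned, key=lambda c: (padding_needed(num_elems, c), -c))


if __name__ == "__main__":
    best = pick_elems_per_comm(1000, [300, 250, 512], world_size=4)
    print(best, padding_needed(1000, best))
```

With 1000 elements and a world size of 4, the candidate 250 is first aligned up to 252, which then needs only 8 elements of padding, compared with 200 for 300 and 24 for 512.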