ZeRO-1 tune max-elems + bug fix #532
zero-1 memory fix
8fc88cce
clean-up and added previously missing reduction options
f39335c8
init self.local_sub_partitions
45b7afae
code to test
5edf3ed1
cleanup
bc8a2e0c
missing fix
c990b871
fix for empty partitions
cc59e10d
pick best max elems per comm to reduce extra padding
c35f2371
fix testing backing to work with torch1.7, fix div by 0
519ea19f
formatting and less aggressive assert
5a401871
map max elems per comm into checkpoints
c91dae82
formatting
94c0c588
tjruwase approved these changes on 2020-11-18
jeffra force pushed from f5cc4cdd to c91dae82 5 years ago
tjruwase approved these changes on 2020-11-18
add dp size to padding to ensure we have enough to split across ranks
1702415d
fix padding logic to align with world size
19fa14fa
Merge branch 'master' into jeffra/zero-1-fix-test
bebc5eb7
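The padding-related commits above ("pick best max elems per comm to reduce extra padding", "fix padding logic to align with world size") describe two ideas that can be sketched roughly as follows. This is only an illustrative sketch in plain Python, not the code from this PR: the function names `padded_length` and `best_max_elems_per_comm`, the candidate list, and the exact padding-cost formula are all assumptions made for the example.

```python
def padded_length(num_elements: int, world_size: int) -> int:
    """Return the smallest multiple of world_size that is >= num_elements.

    Hypothetical helper: pads a flat buffer so it splits evenly across ranks.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    remainder = num_elements % world_size
    if remainder == 0:
        return num_elements
    return num_elements + (world_size - remainder)


def best_max_elems_per_comm(num_elements: int, world_size: int, candidates):
    """Pick the candidate chunk size that leaves the least total padding.

    Hypothetical helper: the cost model (whole chunks, each aligned to
    world_size) is an assumption, not necessarily what the PR implements.
    """
    def total_padding(max_elems: int) -> int:
        # Each communication chunk must itself divide evenly across ranks.
        chunk = padded_length(max_elems, world_size)
        num_chunks = -(-num_elements // chunk)  # ceiling division
        return num_chunks * chunk - num_elements

    # min() returns the first candidate in case of a tie.
    return min(candidates, key=total_padding)
```

With 100 elements, 4 ranks, and candidates `[30, 48, 52]`, the sketch selects 52, since two chunks of 52 leave only 4 padded elements.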
tjruwase approved these changes on 2020-11-19
jeffra merged 08c96a1b into master 5 years ago