Improve checkpointing for Zero stage 1 #5478
ashbhandare dismissed their stale review via df4070cb 5 years ago
ashbhandare force pushed from c27b5295 to df4070cb 5 years ago
ashbhandare dismissed their stale review via 5a453d00 5 years ago
ashbhandare force pushed from 20b7fead to 705d25e3 5 years ago
ashbhandare force pushed from 31f1ca8b to dd2db4c3 5 years ago
ashbhandare force pushed from 0d286b1e to 372e3724 5 years ago
Initial running changes (d71857d8)
Checkpointing aggregation changes (2cbe436c)
compare with older version (903a44aa)
initial cleanup (b31e833e)
Add zero test, minor fix (340ef43e)
Fix zero test, transform, formatting (3d776787)
Review comments (1757e5cc)
add more unit tests (bd402344)
review comments (0146849e)
Try fix CI (29510b78)
Add additional check on just aggregation code (9055d292)
Try fix ckpt gen (82fef12d)
Add pregenerated ckpt for CI, enable zero test in e2e (30d2ce78)
Moving test to nightly, removing ckpt files (a928d873)
Add tests to dist GPU CI (6adec1b4)
Fix dist test (99ca90dd)
Review comments (5bc59119)
ashbhandare force pushed from 59a2698d to 5bc59119 5 years ago
Fix test (eac63e6c)
ashbhandare deleted the aibhanda/zero_1_ckpt branch 5 years ago