Checkpoint reshaping #1953
unit test, remove exception, add notes
24fe7002
Merge branch 'master' of github.com:microsoft/DeepSpeed into elastic-…
70a68d08
Move param_shapes to model files
aafa4e57
Remove hard-coded constants
162c19b3
Merge branch 'olruwase/relocate_param_shapes' of github.com:microsoft…
84c5d170
Merge branch 'master' into olruwase/relocate_param_shapes
59e86dd0
Conditioned to zero optimizer
680e6207
Merge branch 'olruwase/relocate_param_shapes' of github.com:microsoft…
8bf3c4e1
Add zero checkpoint merging
f1b5d16b
Merge branch 'olruwase/relocate_param_shapes' of github.com:microsoft…
58d34953
Merge branch 'master' into olruwase/relocate_param_shapes
145638d8
Print checkpoint version
fd8c3e68
Merge branch 'olruwase/relocate_param_shapes' of github.com:microsoft…
d85a6df0
Merge with relocate_param_shapes
c642600c
Reshape zero_* ckpt files
c8689fd2
Merge zero* files contraction
4a86c1a5
Utils for 3D contraction reshaping
f5db8df8
Rebase
d5c68438
Merge branch 'master' into olruwase/elastic-ckpt-refresh
e6179201
Merge branch 'master' into olruwase/elastic-ckpt-refresh
ef8a4a73
Remove bogus import
c12a4e7f
Merge branch 'olruwase/elastic-ckpt-refresh' of github.com:microsoft/…
0b2c33bf
Merge branch 'master' into olruwase/elastic-ckpt-refresh
86efe30f
Support bf16_zero ckpts
1031b324
Merge branch 'olruwase/elastic-ckpt-refresh' of github.com:microsoft/…
6f294658
Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase…
8f23728c
Add param slice mappings
fd1a377f
Merge branch 'master' into olruwase/elastic-ckpt-refresh
3d4a27b5
Load universal checkpoints
10083db7
Merge branch 'olruwase/elastic-ckpt-refresh' of github.com:microsoft/…
567454a5
Per group mappings from Stas
22c75505
Hack to load bf16 zero files
5df4135c
Param attributes
ae2825fd
WIP
d11a8dc2
Merge branch 'master' into olruwase/elastic-ckpt-refresh
7948c45a
Fix api bug
691b29d1
Merge branch 'olruwase/elastic-ckpt-refresh' of github.com:microsoft/…
a05f9532
Update lp with local/remote hp
c0a42d36
Disable vocab padding handling
b4ca4556
Update z2 checkpoint
b8b54c83
Remove debug prints
be86df9b
Remove debug prints; Rebase unit test
c87543b7
Add reshape assert
c18ff2d0
Padding
4ea36b74
Typo
03715817
Catch nonexistent checkpoint path
a74abc1e
Merge branch 'master' into olruwase/elastic-ckpt-refresh
2b707f2d
Cleanup
529dbaeb
Merge branch 'olruwase/elastic-ckpt-refresh' of github.com:microsoft/…
e126d2e4
jeffra commented on 2022-06-09
Restore checkpoint state comparisons
9e2766fa
mrwyattii approved these changes on 2022-06-10
jeffra approved these changes on 2022-06-10
Merge branch 'master' into olruwase/elastic-ckpt-refresh
5c90ef1e
Merge branch 'master' into olruwase/elastic-ckpt-refresh
726982ba
Merge branch 'master' into olruwase/elastic-ckpt-refresh
add1d0c9
Merge branch 'master' into olruwase/elastic-ckpt-refresh
5fca3db8
Merge branch 'master' into olruwase/elastic-ckpt-refresh
901b1e63
Merge branch 'master' into olruwase/elastic-ckpt-refresh
30896ded
Merge branch 'master' into olruwase/elastic-ckpt-refresh
93934f6a
Merge branch 'master' into olruwase/elastic-ckpt-refresh
ecb3dc8a
Merge branch 'master' into olruwase/elastic-ckpt-refresh
6c7d947e
stas00 commented on 2022-07-04
stas00 commented on 2022-07-04
Merge branch 'master' into olruwase/elastic-ckpt-refresh
cd8dea73
Merge branch 'master' into olruwase/elastic-ckpt-refresh
4217be24
Merge branch 'master' into olruwase/elastic-ckpt-refresh
206e630f
Add torch version guards
14980ad4
Merge branch 'master' into olruwase/elastic-ckpt-refresh
f3145818
More precise avoidance of false positives.
868c463a
Merge branch 'olruwase/elastic-ckpt-refresh' of github.com:microsoft/…
e22487af
Merge branch 'master' into olruwase/elastic-ckpt-refresh
e0da15f9
Merge branch 'master' into olruwase/elastic-ckpt-refresh
623430e0
Merge branch 'master' into olruwase/elastic-ckpt-refresh
2556578b
Merge branch 'master' into olruwase/elastic-ckpt-refresh
bf57d814
Merge branch 'master' into olruwase/elastic-ckpt-refresh
e4a5a464
tjruwase merged 80d0a32f into master 3 years ago