Reshape deepspeed checkpoint (#239)
* Reshape deepspeed checkpoint
* add checkpoint tests
* Validate input folder
* Tests for tp/pp reshape
* remove debug folders
* fix test_checkpoint_reshaping_empty_dir
* Fix unit tests
* Remove deepspeed checkpoint utils
* Use DS 3D reshaping utils
* convert to bf16
* WIP universal checkpoint
* rename
* rename
* WIP on handling fragments
* cleanup
* Loading universal checkpoint with reshaping
* all 1<->2 GPU reshapes work
* param attrs
* make the tests adaptable to the number of available gpus
* WIP
* WIP
* WIP
* WIP
* Debug functions
* args should be required, don't create another latest file
* Parallelize shard extraction
* close+join pool; add tqdm; comment out noise
* rename
* parameterize
* Parallel slice merging
* Cleanup
* allow inspection on a machine w/o gpus
* test against the right DS branch
* DS size was merged
Co-authored-by: Stas Bekman <stas@stason.org>