DeepSpeed
2a921069 - [model weights] zero_to_fp32 multiple improvements (#1181)

Commit
4 years ago
[model weights] zero_to_fp32 multiple improvements (#1181)

* add live zero checkpoint to fp32 consolidation version
* some more docs
* zero2 model states uses a different filename
* fix
* make debug mode cli configurable
* copy the script only on node 0 process 0
* validate that we have the right number of files
* revamp _get_zero_param_shapes, instrument with easier debug
* correct assertion
* rename API; add even simpler API
* style
* docs improve
* update the docs
* revert the unpartitioned_params detection and report as it's most likely persistent params

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
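
The "validate that we have the right number of files" item can be illustrated with a minimal sketch. This is not the code from the commit, which is not shown here. The helper name `check_shard_count`, its parameters, and the default `*_optim_states.pt` filename pattern are all assumptions made for illustration. The idea is only that a consolidation script should fail early when the number of per-rank checkpoint files on disk does not match the number it expects.

```python
import glob
import os


def check_shard_count(checkpoint_dir, expected_count, pattern="*_optim_states.pt"):
    """Hypothetical helper: return the sorted shard paths, or raise on a mismatch.

    `pattern` is an assumed filename pattern, not taken from the commit.
    """
    files = sorted(glob.glob(os.path.join(checkpoint_dir, pattern)))
    if len(files) != expected_count:
        # Fail before any tensors are loaded, with a message that names the gap.
        raise ValueError(
            f"expected {expected_count} files matching {pattern!r} "
            f"in {checkpoint_dir}, found {len(files)}"
        )
    return files
```

Raising before any shard is loaded keeps the failure cheap and points directly at a missing or extra file rather than at a later shape mismatch.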