[model weights] zero_to_fp32 multiple improvements (#1181)
* add a live (in-process) version of the zero checkpoint to fp32 consolidation
* some more docs
* zero2 model states use a different filename
* fix
* make debug mode configurable from the cli
* copy the script only on node 0 process 0
* validate that we have the right number of files
* revamp _get_zero_param_shapes; instrument it for easier debugging
* correct assertion
* rename the API; add an even simpler API
* style
* improve the docs
* update the docs
* revert the unpartitioned_params detection and reporting, as these are most likely persistent params
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>