DeepSpeed
[model weights] zero_to_fp32 multiple improvements
#1181
Merged

Commits
  • add live zero checkpoint to fp32 consolidation version
    stas00 committed 4 years ago
  • some more docs
    stas00 committed 4 years ago
  • zero2 model states uses a different filename
    stas00 committed 4 years ago
  • fix
    stas00 committed 4 years ago
  • Merge remote-tracking branch 'origin/master' into z2fp32-auto
    stas00 committed 4 years ago
  • make debug mode cli configurable
    stas00 committed 4 years ago
  • copy the script only on node 0 process 0
    stas00 committed 4 years ago
  • validate that we have the right number of files
    stas00 committed 4 years ago
  • revamp _get_zero_param_shapes, instrument with easier debug
    stas00 committed 4 years ago
  • correct assertion
    stas00 committed 4 years ago
  • rename API; add even simpler API
    stas00 committed 4 years ago
  • style
    stas00 committed 4 years ago
  • docs improve
    stas00 committed 4 years ago
  • update the docs
    stas00 committed 4 years ago
  • Merge branch 'master' into z2fp32-auto
    tjruwase committed 4 years ago
  • Merge branch 'master' into z2fp32-auto
    tjruwase committed 4 years ago
  • Merge remote-tracking branch 'origin/master' into z2fp32-auto
    stas00 committed 4 years ago
  • Merge branch 'master' into z2fp32-auto
    tjruwase committed 4 years ago
  • Merge branch 'master' into z2fp32-auto
    tjruwase committed 4 years ago
  • revert the unpartitioned_params detection and report as it's most likely persistent params
    stas00 committed 4 years ago
  • Merge branch 'master' into z2fp32-auto
    tjruwase committed 4 years ago