DeepSpeed
[ZeRO] Default disable elastic ckpt in stage 1+2 and reduce CPU memory overhead during ckpt load
#1525
Merged

Commits
  • [squash] zero-ckpt-cpu-issue (#1673)
    jeffra committed 3 years ago
  • formatting
    jeffra committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    tjruwase committed 3 years ago
  • Reduce cpu memory of loading in rigid mode
    tjruwase committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    tjruwase committed 3 years ago
  • Allocate tensor on param device
    tjruwase committed 3 years ago
  • Merge branch 'zero-ckpt-cpu-issue' of github.com:microsoft/DeepSpeed into zero-ckpt-cpu-issue
    tjruwase committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    tjruwase committed 3 years ago
  • add WS check + several unit tests for ckpting (TODO: need to fix a few FT test cases)
    jeffra committed 3 years ago
  • uncomment exception check in ckpt test
    jeffra committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    jeffra committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    tjruwase committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    jeffra committed 3 years ago
  • fixes for remaining unit tests
    jeffra committed 3 years ago
  • Merge branch 'master' into zero-ckpt-cpu-issue
    jeffra committed 3 years ago