DeepSpeed
Universal checkpoint for zero stage 1
#2284
Merged

Commits
  • Refactor universal checkpointing and tensor fragments
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Formatting
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/refactor_universal_checkpoint
    tjruwase committed 3 years ago
  • Support zero stage1; Expand TP dim
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Remove debug prints
    tjruwase committed 3 years ago
  • Merge branch 'olruwase/zero_1_2_universal_ckpt' of github.com:microsoft/DeepSpeed into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Detect sharded optimizer state
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge master
    tjruwase committed 3 years ago
  • Format fixes
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Encode reshaping guide
    tjruwase committed 3 years ago
  • Merge branch 'olruwase/zero_1_2_universal_ckpt' of github.com:microsoft/DeepSpeed into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • More symbolic constants
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    mrwyattii committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago
  • Merge branch 'master' into olruwase/zero_1_2_universal_ckpt
    tjruwase committed 3 years ago