DeepSpeed
Use `non_reentrant_checkpoint` to fix the requirement that input `requires_grad` must be true for activation checkpoint layers in pipeline training.
#4224
Merged

Commits
  • feat: add `non_reentrant_checkpoint`
    hughpu committed 2 years ago
  • feat: add missing output postprocess and change the hook to record leaf forward tensor refs
    hughpu committed 2 years ago
  • fix: make the multi_grad_hook registered after graph construction
    hughpu committed 2 years ago
  • fix: backward compatibility for multi_tensor_hook
    hughpu committed 2 years ago
  • fix: nonlocal reference error of deepspeed_saved_tensors
    hughpu committed 2 years ago
  • fix: reduce repeating hook registration
    hughpu committed 2 years ago
  • Merge branch 'microsoft:master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • test: add test for `activation_checkpointing.checkpointing.non_reentrant_checkpoint`
    hughpu committed 2 years ago
  • Pass correct node size for ZeRO++ (#4085)
    hughpu committed 2 years ago
  • add deepspeed chat arxiv report (#4110)
    hughpu committed 2 years ago
  • style: change flake8 detected style mismatch
    hughpu committed 2 years ago
  • test: hack to clone the `test_activation_checkpointing` module for reuse and add regression tests
    hughpu committed 2 years ago
  • doc: explain the introduction of `non_reentrant_checkpoint`
    hughpu committed 2 years ago
  • doc: explain the test of `non_reentrant_checkpoint`
    hughpu committed 2 years ago
  • Merge branch 'microsoft:master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    hughpu committed 2 years ago
  • Merge branch 'master' into feat/non-reentrant-checkpoint
    tjruwase committed 2 years ago
  • apply non_reentrant_checkpoint in pipeline parallel training
    inkcherry committed 2 years ago
  • ut pass
    inkcherry committed 2 years ago
  • Merge branch 'master' into use_reentrant
    inkcherry committed 2 years ago
  • fix ci
    inkcherry committed 2 years ago
  • Merge branch 'master' into use_reentrant
    inkcherry committed 2 years ago
  • reduce check level for ci
    inkcherry committed 2 years ago
  • reduce check level for ci
    inkcherry committed 2 years ago
  • Merge branch 'master' into use_reentrant
    tohtana committed 2 years ago
  • Merge branch 'master' into use_reentrant
    tohtana committed 2 years ago