DeepSpeed
use ```non_reentrant_checkpoint``` fix requires_grad of input must be true for activation checkpoint layer in pipeline train.
#4224
Merged

tohtana merged 36 commits into deepspeedai:master from inkcherry:use_reentrant
hughpu feat: add `non_reentrant_checkpoint`
a20c79ca
hughpu feat: add missing output postprocess and change the hook to record le…
8aeba5f6
hughpu fix: make the multi_grad_hook registered after graph construction
ee04fa8f
hughpu fix: backward compatibility for multi_tensor_hook
51f833d4
hughpu fix: nonlocal reference error of deepspeed_saved_tensors
b29c1efa
hughpu fix: reduce repeating hook registration
37e7c234
hughpu Merge branch 'microsoft:master' into feat/non-reentrant-checkpoint
d7c5440c
hughpu test: add test for `activation_checkpointing.checkpointing.non_reentr…
e22c4877
cmikeh2 Pass correct node size for ZeRO++ (#4085)
4d2a274b
conglongli add deepspeed chat arxiv report (#4110)
d4d070b8
hughpu style: change flake8 detected style missmatch
aaf309e6
hughpu test: hack to clone the `test_activation_checkpointing` module for re…
a9109221
hughpu doc: explain the introduction of `non_reentrant_checkpoint`
fc919b17
hughpu doc: explain the test of `non_reentrant_checkpoint`
b6a0a44b
hughpu Merge branch 'microsoft:master' into feat/non-reentrant-checkpoint
8ec86a4c
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
78c0d65d
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
e4eff23a
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
a6c78715
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
fbbb7604
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
a338097c
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
a00cff1e
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
c17cc3d0
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
a680399f
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
13e766de
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
a46e3260
hughpu Merge branch 'master' into feat/non-reentrant-checkpoint
b5c03f42
tjruwase Merge branch 'master' into feat/non-reentrant-checkpoint
13a026d4
inkcherry apply non_reentrant_checkpoint in pipeline parallel training
0c18dda3
inkcherry ut pass
71421bfc
inkcherry requested a review from ShadenSmith 2 years ago
inkcherry requested a review from duli2012 2 years ago
inkcherry requested a review from jeffra 2 years ago
inkcherry requested a review from tjruwase 2 years ago
inkcherry requested a review from mrwyattii 2 years ago
inkcherry Merge branch 'master' into use_reentrant
18d64d32
inkcherry fix ci
d9edc63e
inkcherry Merge branch 'master' into use_reentrant
d98790d2
inkcherry reduce check level for ci
c27d3345
inkcherry reduce check level for ci
79b427dd
tohtana Merge branch 'master' into use_reentrant
80beb040
tohtana Merge branch 'master' into use_reentrant
bdc9db0d
tohtana requested a review from tohtana 2 years ago
tohtana approved these changes on 2023-09-06
tohtana enabled auto-merge 2 years ago
tohtana merged 60a3e89e into master 2 years ago
