DeepSpeed
Fix gradient checkpointing with use_reentrant=True / PyTorch-style backward / ZeRO-3
#7780
Merged

tohtana
tohtana requested a review from tjruwase 55 days ago
tohtana requested a review from loadams 55 days ago
tohtana fix backward with checkpointing and reentrant
a09084b0
loadams Update README with newer status badges for CI
8211447d
tohtana Add timeout to test workflows (#7774)
f6026d19
loadams Remove cron/PR triggers for outdated V100 tests (#7777)
3cf426cd
tohtana fix yapf formatting in test file
9116f4a9
tohtana force pushed from 40899c62 to 9116f4a9 55 days ago
tohtana Merge branch 'master' into tohtana/backward_with_reentrant
cc00bd9b
PKUWZP approved these changes on 2026-01-15
tohtana added sync in tests
f61abc90
loadams Merge branch 'master' into tohtana/backward_with_reentrant
0c994459
loadams approved these changes on 2026-01-15
tohtana extract function to clear params
84fa1db0
tohtana Merge branch 'tohtana/backward_with_reentrant' of github.com:tohtana/…
90926ba1
tohtana fix issue with backward count
ddb54e76
tohtana fix backward hook state management
3f6938ea
tohtana fix for zero1
db1ff062
tohtana fix micro step id count
e2de9a4a
tohtana Merge branch 'master' into tohtana/backward_with_reentrant
c33e14f2
tohtana merged 311674ff into master 53 days ago
