Training multiple models #7018
Support multiple engines
f488b46c
Use module backward prehook
65648843
Remove pdb
7745ae57
Remove dead code
a4ea1209
Add module forward hooks
3a7a94fc
Rebase branch
1ad82763
Formatting
1e2595fa
Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase…
4abfd9f9
Cleanup
204c4dd1
Cleanup
78e19152
stas00
commented
on 2025-02-08
Bug fix
16d60bc3
Prepare gradient handling in zero stage 1 & 2
02477ceb
Merge branch 'master' into olruwase/zero_multi_models
2f84032c
Merge branch 'master' into olruwase/zero_multi_models
5ded3a92
Merge branch 'master' into olruwase/zero_multi_models
69c1489e
Add unit tests
3b868609
Merge branch 'master' into olruwase/zero_multi_models
7edabdb4
Formatting
a7744daf
Fix CI failures due to curriculum learning
b5c556d2
Merge branch 'master' into olruwase/zero_multi_models
c20e3930
Merge branch 'master' into olruwase/zero_multi_models
19e9c1d4
Merge branch 'master' into olruwase/zero_multi_models
51f7bf64
stas00
commented
on 2025-03-05
Update deepspeed/runtime/engine.py
2d267f84
Update deepspeed/runtime/engine.py
51de6710
stas00
approved these changes
on 2025-03-06
Multiple models with independent loss (legacy case)
acc22eec
Merge branch 'olruwase/zero_multi_models' of github.com:microsoft/Dee…
86f08f82
Update UT and docs
e5a49580
Tweak RTD
82372c39
Tweak RTD
f50f5e69
Merge branch 'master' into olruwase/zero_multi_models
e7fe8148
Merge branch 'olruwase/zero_multi_models' of github.com:microsoft/Dee…
f7792f9c
Merge branch 'master' into olruwase/zero_multi_models
3002fcec
tjruwase
merged
b418cf6c
into master 291 days ago
tjruwase
deleted the olruwase/zero_multi_models branch 291 days ago