DeepSpeed
Training multiple models
#7018
Merged

tjruwase merged 32 commits into master from olruwase/zero_multi_models
tjruwase Support multiple engines
f488b46c
tjruwase Use module backward prehook
65648843
tjruwase Remove pdb
7745ae57
tjruwase Remove dead code
a4ea1209
tjruwase Add module forward hooks
3a7a94fc
tjruwase Rebase branch
1ad82763
tjruwase Formatting
1e2595fa
tjruwase Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase…
4abfd9f9
tjruwase Cleanup
204c4dd1
tjruwase Cleanup
78e19152
tjruwase requested a review from tohtana 323 days ago
tjruwase requested a review from stas00 323 days ago
stas00 commented on 2025-02-08
tjruwase Bug fix
16d60bc3
tjruwase Prepare gradient handling in zero stage 1 & 2
02477ceb
tjruwase Merge branch 'master' into olruwase/zero_multi_models
2f84032c
loadams Merge branch 'master' into olruwase/zero_multi_models
5ded3a92
tjruwase Merge branch 'master' into olruwase/zero_multi_models
69c1489e
tjruwase Add unit tests
3b868609
tjruwase requested a review from loadams 305 days ago
tjruwase Merge branch 'master' into olruwase/zero_multi_models
7edabdb4
tjruwase Formatting
a7744daf
tjruwase Fix CI failures due to curriculum learning
b5c556d2
tjruwase Merge branch 'master' into olruwase/zero_multi_models
c20e3930
tjruwase Merge branch 'master' into olruwase/zero_multi_models
19e9c1d4
tjruwase Merge branch 'master' into olruwase/zero_multi_models
51f7bf64
stas00 commented on 2025-03-05
tjruwase Update deepspeed/runtime/engine.py
2d267f84
tjruwase Update deepspeed/runtime/engine.py
51de6710
stas00 approved these changes on 2025-03-06
tjruwase Multiple models with independent loss (legacy case)
acc22eec
tjruwase Merge branch 'olruwase/zero_multi_models' of github.com:microsoft/Dee…
86f08f82
tjruwase Update UT and docs
e5a49580
tjruwase Tweak RTD
82372c39
tjruwase Tweak RTD
f50f5e69
loadams Merge branch 'master' into olruwase/zero_multi_models
e7fe8148
tjruwase Merge branch 'olruwase/zero_multi_models' of github.com:microsoft/Dee…
f7792f9c
tjruwase Merge branch 'master' into olruwase/zero_multi_models
3002fcec
tjruwase merged b418cf6c into master 291 days ago
tjruwase deleted the olruwase/zero_multi_models branch 291 days ago
