transformers
[trainer] deepspeed integration
#9211
Merged

stas00 merged 69 commits into huggingface:master from stas00:ds
stas00
stas00 deepspeed integration
b99b6653
stas00 style
cf2f0d2f
stas00 add test
d417f553
patrickvonplaten
patrickvonplaten commented on 2020-12-21
patrickvonplaten
patrickvonplaten commented on 2020-12-21
patrickvonplaten
patrickvonplaten approved these changes on 2020-12-21
sgugger
sgugger approved these changes on 2020-12-21
stas00 ds wants to do its own backward
112be601
stas00 fp16 assert
4c2809dc
jeffra
stas00
stas00 Update src/transformers/training_args.py
f4de6ff0
stas00 Merge branch 'ds' of github.com:stas00/transformers into ds
bd350f6c
stas00 style
8565c8a6
stas00 Merge remote-tracking branch 'origin/master' into ds
9653f8eb
stas00 Merge remote-tracking branch 'origin/master' into ds
e980e21b
stas00 for clarity extract what args are being passed to deepspeed
9cc3b63d
stas00 introduce the concept of self.wrapped_model
15104443
stas00
sgugger
stas00
g-karthik
stas00
stas00 s/self.wrapped_model/self.model_wrapped/
f28566e0
stas00 complete transition to self.wrapped_model / self.model
caa32dc7
stas00 fix
9f199e71
stas00 doc
fb0c13eb
stas00 give ds its own init
6af645f3
stas00 add custom overrides, handle bs correctly
765594df
stas00 fix test
8ebe5c74
stas00 clean up model_init logic, fix small bug
e8c10804
stas00 complete fix
3b7c5815
stas00 collapse --deepspeed_config into --deepspeed
aa8f9a13
stas00 style
a83c46a5
stas00 start adding doc notes
aaf97e10
stas00 style
3dedd7a8
stas00 Merge remote-tracking branch 'origin/master' into ds
af22ec1d
stas00 implement hf2ds optimizer and scheduler configuration remapping
869173f6
stas00
stas00 oops
2f73bfe6
sgugger
sgugger commented on 2021-01-04
stas00 call get_num_training_steps absolutely when needed
9c778238
stas00
stas00
stas00 workaround broken auto-formatter
e6610dae
stas00 deepspeed_config arg is no longer needed - fixed in deepspeed master
a1ed8387
stas00 use hf's fp16 args in config
25a93d9b
stas00
sgugger
sgugger commented on 2021-01-05
sgugger
stas00 clean
39f64467
stas00 Merge remote-tracking branch 'origin/master' into ds
b3667b08
stas00 start on the docs
df1154c8
stas00 Merge remote-tracking branch 'origin/master' into ds
c5afaecb
stas00
stas00 commented on 2021-01-07
stas00 rebase cleanup
c9a6266f
g-karthik
g-karthik commented on 2021-01-07
stas00 finish up --fp16
8bb2462a
stas00
stas00
stas00 clarify the supported stages
c9e7de70
sgugger
stas00
sgugger
stas00
sgugger
stas00 Merge remote-tracking branch 'origin/master' into ds
33087ca9
stas00 big refactor thanks to discovering deepspeed.init_distributed
dc00de77
stas00
stas00 cleanup
ad967b45
stas00 revert fp16 part
52d80009
stas00 add checkpoint-support
31dba171
stas00 more init ds into integrations
6c5432a2
stas00 extend docs
8364b277
stas00
stas00 cleanup
f462b5c8
stas00 unfix docs
152c2392
sgugger
sgugger approved these changes on 2021-01-08
sgugger requested a review from LysandreJik 4 years ago
stas00 Merge remote-tracking branch 'origin/master' into ds
38ce263f
stas00 clean up old code
2d4de17b
stas00 imports
f90822f0
stas00 move docs
42260fe4
stas00 fix logic
dda4a44c
tjruwase
tjruwase commented on 2021-01-08
stas00 make it clear which file it's referring to
f9d60287
stas00 document nodes/gpus
57f58883
stas00 style
c8ef31e4
stas00
stas00 wrong format
c0f5e1b8
stas00 style
20e3ab6c
stas00 deepspeed handles gradient clipping
777ae49c
stas00 easier to read
0bbc65e8
stas00 major doc rewrite
c65a6808
stas00
sgugger
sgugger approved these changes on 2021-01-11
stas00 Apply suggestions from code review
4b9dd767
stas00 docs
36c6f57d
stas00 switch to AdamW optimizer
7f5e5797
stas00 style
b210ae30
stas00 Merge remote-tracking branch 'origin/master' into ds
96b7d3ad
stas00
LysandreJik
LysandreJik approved these changes on 2021-01-12
stas00 Apply suggestions from code review
d8da0c7b
stas00 clarify doc
19ad552a
stas00 Merge remote-tracking branch 'origin/master' into ds
19e4972c
stas00
stas00 merged 2df34f4a into master 4 years ago
stas00 deleted the ds branch 4 years ago
patrickvonplaten
stas00 added DeepSpeed
