Enable Gradient Accumulation fix across all models + trainer fully in forward() #34283
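The PR's goal, per the title: make the gradient accumulation loss fix apply to every model by handling it fully inside forward(), rather than patching it in the Trainer alone. A minimal sketch of the underlying bug, assuming token-level cross-entropy; `num_items_in_batch` follows the naming this PR introduces, while the helper functions themselves are illustrative, not the actual transformers code:

```python
import torch
import torch.nn.functional as F

def naive_accum_loss(logits_chunks, labels_chunks):
    # Mean per micro-batch, then mean over micro-batches: a chunk with few
    # valid tokens gets the same weight as a full chunk, so the result
    # differs from training on the whole batch at once.
    losses = [
        F.cross_entropy(lg, lb, ignore_index=-100)
        for lg, lb in zip(logits_chunks, labels_chunks)
    ]
    return sum(losses) / len(losses)

def fixed_accum_loss(logits_chunks, labels_chunks):
    # Sum the loss over every token, then divide once by the number of
    # valid tokens across the whole accumulation window.
    num_items_in_batch = sum((lb != -100).sum() for lb in labels_chunks)
    total = sum(
        F.cross_entropy(lg, lb, ignore_index=-100, reduction="sum")
        for lg, lb in zip(logits_chunks, labels_chunks)
    )
    return total / num_items_in_batch
```

With `fixed_accum_loss`, splitting a batch into micro-batches yields the same loss as processing it in one step, which is what moving the normalization into forward() (where the token count is available per model) is meant to guarantee.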
Commits:
Enable grad accum fix across all models + trainer fully in forward() (0bfce6e2)
handle peft case (8c12bf42)
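A hedged guess at what "handle peft case" covers: a PEFT adapter wraps the base model and hides its forward signature, so detecting whether the model can accept loss kwargs has to look through the wrapper. `accepts_loss_kwargs` is a hypothetical helper, not the Trainer's actual code:

```python
import inspect

def accepts_loss_kwargs(model) -> bool:
    # A PeftModel's own forward signature says nothing about the wrapped
    # model, so unwrap before inspecting (peft is an optional dependency).
    try:
        from peft import PeftModel
        if isinstance(model, PeftModel):
            model = model.get_base_model()
    except ImportError:
        pass
    params = inspect.signature(model.forward).parameters
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
```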
Account for DDP: need to run scale tests (058fe34f)
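On the DDP commit: under DistributedDataParallel each rank only sees its slice of the batch, so a per-rank token count would give every rank a different loss denominator. One way to keep ranks consistent, sketched with a plain all-reduce (whether the PR does exactly this is not shown in this thread):

```python
import torch
import torch.distributed as dist

def global_num_items(labels: torch.Tensor, ignore_index: int = -100) -> torch.Tensor:
    # Count valid target tokens on this rank, then sum across ranks so
    # every process divides its loss by the same global denominator.
    count = (labels != ignore_index).sum()
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(count, op=dist.ReduceOp.SUM)
    return count
```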
Use accelerator state (fc6d6747)
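"Use accelerator state" suggests reading the distributed topology from Accelerate's shared state rather than querying torch.distributed directly. `PartialState` is a real Accelerate class; its use for this purpose here is an assumption about the commit:

```python
from accelerate import PartialState

# Accelerate keeps one process-wide state object; asking it for the world
# size avoids duplicating torch.distributed initialization checks.
state = PartialState()
needs_reduction = state.num_processes > 1
```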
muellerzr force-pushed to fc6d6747 1 year ago
Quality (0aeb5ac4)
Guard (49b29d22)
Experiment w/ only fairseq fix (4f3f86d2)
Fairseq only (58ee6805)
Revert multiply_grads fix (2d58b30b)
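The three commits above experiment with the fairseq-style route, then back it out: run backward on a summed (unreduced) loss, then rescale the accumulated gradients once. A sketch modeled on fairseq's multiply_grads; the revert suggests this approach did not survive into the final PR:

```python
import torch

def multiply_grads(model: torch.nn.Module, c: float) -> None:
    # Rescale every accumulated gradient in place, e.g. by
    # 1.0 / num_items_in_batch after the last micro-batch's backward().
    for p in model.parameters():
        if p.grad is not None:
            p.grad.mul_(c)
```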
Mult by grad accum to fully bring back solution (921abb8d)
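Why multiplying by the accumulation steps would be needed: accelerator.backward() divides the loss by gradient_accumulation_steps, on the assumption that each micro-batch loss is a mean to be averaged across steps. Once the loss is already normalized a single time by num_items_in_batch for the whole window, that division must be cancelled. A sketch under those assumptions; passing `num_items_in_batch` as a forward kwarg is the pattern this PR enables:

```python
def training_step(accelerator, model, batch, num_items_in_batch, grad_accum_steps):
    # The token count reaches the loss through forward(), which is the
    # point of this PR.
    loss = model(**batch, num_items_in_batch=num_items_in_batch).loss
    # accelerator.backward() divides by grad_accum_steps; the loss is
    # already normalized once over the full window, so multiply it back.
    loss = loss * grad_accum_steps
    accelerator.backward(loss)
    return loss.detach()
```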
Style (44179844)
Good to go now (21ca9a4f)
Skip fx tests for now (9967afc0)
Bookmark (98cbf7c1)
Working now (4e2328d5)
muellerzr merged d9f73362 into main 1 year ago
muellerzr deleted the fixup-loss_fn_issues branch 1 year ago
Labels: Core: Modeling, trainer