transformers
Enable Gradient Accumulation fix across all models + trainer fully in forward()
#34283
Merged

muellerzr merged 15 commits into main from fixup-loss_fn_issues
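For context on what the fix addresses: a minimal sketch, not the PR's actual code (the function names and numbers below are invented for illustration), of why taking the mean loss per micro-batch makes gradient accumulation diverge from full-batch training when micro-batches contain unequal numbers of valid tokens, and how normalizing by the total token count (the role of `num_items_in_batch` in the Trainer fix) restores equivalence.

```python
# Illustrative sketch, not the PR's actual code.

def naive_accumulated_loss(micro_batches):
    """Buggy pattern: mean loss per micro-batch, then mean over steps."""
    step_means = [sum(t) / len(t) for t in micro_batches]
    return sum(step_means) / len(step_means)

def fixed_accumulated_loss(micro_batches):
    """Fixed pattern: sum all per-token losses and divide by the total
    number of valid tokens across every accumulation step."""
    total_loss = sum(sum(t) for t in micro_batches)
    total_tokens = sum(len(t) for t in micro_batches)
    return total_loss / total_tokens

# Two accumulation steps with unequal valid-token counts (e.g. padding).
micro_batches = [[2.0, 2.0, 2.0, 2.0], [8.0]]  # per-token losses

full_batch_mean = sum(map(sum, micro_batches)) / 5  # 16.0 / 5 = 3.2
print(naive_accumulated_loss(micro_batches))  # 5.0, diverges from 3.2
print(fixed_accumulated_loss(micro_batches))  # 3.2, matches full batch
```

The two quantities only coincide when every micro-batch has the same number of valid tokens, which variable-length text batches rarely do.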
muellerzr added the Core: Modeling and trainer labels
muellerzr requested a review from ArthurZucker 1 year ago
ArthurZucker commented on 2024-10-21
man-shar commented on 2024-10-22
muellerzr requested a review from BenjaminBossan 1 year ago
BenjaminBossan approved these changes on 2024-10-22
0bfce6e2 Enable grad accum fix across all models + trainer fully in forward()
8c12bf42 handle peft case
058fe34f Account for DDP: need to run scale tests
fc6d6747 Use accelerator state
muellerzr force-pushed to fc6d6747 1 year ago
0aeb5ac4 Quality
49b29d22 Guard
4f3f86d2 Experiment w/ only fairseq fix
58ee6805 Fairseq only
zhijian-liu commented on 2024-10-23
zhijian-liu commented on 2024-10-23
2d58b30b Revert multiply_grads fix
921abb8d Mult by grad accum to fully bring back solution
44179844 Style
21ca9a4f Good to go now
9967afc0 Skip fx tests for now
98cbf7c1 Bookmark
4e2328d5 Working now
muellerzr muellerzr merged d9f73362 into main 1 year ago
muellerzr muellerzr deleted the fixup-loss_fn_issues branch 1 year ago