transformers
Enable Gradient Accumulation fix across all models + trainer fully in forward()
#34283
Merged

muellerzr merged 15 commits into main from fixup-loss_fn_issues
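For context on what the fix addresses: a minimal sketch, not the PR's actual code (the function names and numbers below are invented for illustration), of why taking the mean loss per micro-batch makes gradient accumulation diverge from full-batch training when micro-batches contain unequal numbers of valid tokens, and how normalizing by the total token count (the role of `num_items_in_batch` in the Trainer fix) restores equivalence.

```python
# Illustrative sketch, not the PR's actual code.

def naive_accumulated_loss(micro_batches):
    """Buggy pattern: mean loss per micro-batch, then mean over steps."""
    step_means = [sum(t) / len(t) for t in micro_batches]
    return sum(step_means) / len(step_means)

def fixed_accumulated_loss(micro_batches):
    """Fixed pattern: sum all per-token losses and divide by the total
    number of valid tokens across every accumulation step."""
    total_loss = sum(sum(t) for t in micro_batches)
    total_tokens = sum(len(t) for t in micro_batches)
    return total_loss / total_tokens

# Two accumulation steps with unequal valid-token counts (e.g. padding).
micro_batches = [[2.0, 2.0, 2.0, 2.0], [8.0]]  # per-token losses

full_batch_mean = sum(map(sum, micro_batches)) / 5  # 16.0 / 5 = 3.2
print(naive_accumulated_loss(micro_batches))  # 5.0, diverges from 3.2
print(fixed_accumulated_loss(micro_batches))  # 3.2, matches full batch
```

The two quantities only coincide when every micro-batch has the same number of valid tokens, which variable-length text batches rarely do.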
muellerzr added the Core: Modeling and trainer labels
muellerzr requested a review from ArthurZucker 1 year ago
ArthurZucker commented on 2024-10-21
man-shar commented on 2024-10-22
muellerzr requested a review from BenjaminBossan 1 year ago
BenjaminBossan approved these changes on 2024-10-22
0bfce6e2 Enable grad accum fix across all models + trainer fully in forward()
8c12bf42 handle peft case
058fe34f Account for DDP: need to run scale tests
fc6d6747 Use accelerator state
muellerzr force-pushed to fc6d6747 1 year ago
0aeb5ac4 Quality
49b29d22 Guard
4f3f86d2 Experiment w/ only fairseq fix
58ee6805 Fairseq only
zhijian-liu commented on 2024-10-23
zhijian-liu commented on 2024-10-23
2d58b30b Revert multiply_grads fix
921abb8d Mult by grad accum to fully bring back solution
44179844 Style
21ca9a4f Good to go now
9967afc0 Skip fx tests for now
98cbf7c1 Bookmark
4e2328d5 Working now
muellerzr muellerzr merged d9f73362 into main 1 year ago
muellerzr muellerzr deleted the fixup-loss_fn_issues branch 1 year ago