transformers
d9f73362 - Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)

Commit

1 year ago

Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)

* Enable grad accum fix across all models + trainer fully in forward()
* handle peft case
* Account for DDP: need to run scale tests
* Use accelerator state
* Quality
* Guard
* Experiment w/ only fairseq fix
* Fairseq only
* Revert multiply_grads fix
* Mult by grad accum to fully bring back solution
* Style
* Good to go now
* Skip fx tests for now
* Bookmark
* Working now
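The commit message does not describe the underlying problem, so the following is only a minimal sketch of the loss-normalization mismatch that gradient accumulation fixes of this kind usually address. It is not code from this commit. The function names and the per-token loss values are hypothetical, and the sketch assumes a token-level loss averaged over tokens.

```python
def naive_accumulated_loss(micro_batches):
    """Average each micro-batch's per-token losses, then average those means.

    This mirrors dividing each micro-batch's mean loss by the number of
    accumulation steps (assumed behaviour, for illustration only).
    """
    means = [sum(tokens) / len(tokens) for tokens in micro_batches]
    return sum(means) / len(means)


def token_normalized_loss(micro_batches):
    """Sum every per-token loss and divide once by the total token count."""
    total_loss = sum(sum(tokens) for tokens in micro_batches)
    total_tokens = sum(len(tokens) for tokens in micro_batches)
    return total_loss / total_tokens


# Hypothetical per-token losses for two micro-batches of unequal length.
micro_batches = [[1.0, 1.0, 1.0, 1.0], [3.0]]

naive = naive_accumulated_loss(micro_batches)
exact = token_normalized_loss(micro_batches)

# Loss of the same five tokens processed as one large batch.
all_tokens = [loss for batch in micro_batches for loss in batch]
full_batch = sum(all_tokens) / len(all_tokens)

print(f"naive={naive:.2f} exact={exact:.2f} full_batch={full_batch:.2f}")
```

When micro-batches contain different numbers of tokens, the naive average over-weights the short micro-batch and no longer matches the single-large-batch loss. Normalizing once by the total token count restores the match.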

References

#34283 - Enable Gradient Accumulation fix across all models + trainer fully in forward()

Author

muellerzr

Parents

1fb575fc
