transformers
Enable Gradient Accumulation fix across all models + trainer fully in forward()
#34283
Merged
