Fix Gradient Accumulation issue (#34191)
* quick fix
* 3 losses
* oops
* fix
* nits
* check how it scales for special models
* propagate for conditional detr
* propagate
* propagate
* propagate
* fixes
* propagate changes
* update
* fixup
* nits
* f string
* fixes
* more fixes
* ?
* nit
* arg annoying f string
* nits
* grumble
* update
* nit
* refactor
* fix fetch tests
* nit
* nit
* Update src/transformers/loss/loss_utils.py
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* update
* nit
* fixup
* make pass
* nits
* port code to more models
* fixup
* nits
* arf
* update
* update
* nits
* update
* fix
* update
* nits
* fine
* agjkfslga.jsdlkgjklas
* nits
* fix fx?
* update
* update
* style
* fix imports
* update
* update
* fixup to fix the torch fx?
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>