Remove many output_attentions and other traced outputs on 100+ models (#43590)
* first batch, let's see
* propagate changes
* fixup broken tests
* simplify more models
* hack
* fix attentions
* I'm bullied by the new fix-repo
* this one too
* fix-repo
* ...fix-repo?
* change up
* propagate changes again
* more changes
* fixes
* revert CLAP to the back burner
* more changes
* some broken clap stuff
* New batch of models
* something forgotten?
* biiig batch + handle explicit None values for output_attentions in generic.py
* make fixup
* remove dummy record outputs
* batch, with some difficult ones (dpt)
* ugly fix
* update
* fix
* propagate changes
* some more modifs
* fixup
* be careful about this stuff
* style
* fix repo
* fix more
* loose booleans
* missing docstring
* fix import
* fixing fixing
* hacky kosmos2.5...
* upgrade
* simplifyyy
* and another one
* propagate many changes
* upgraaade
* fix
* ay
* fixes, many
* fiiix
* more more updates
* might slow things down?
* fix fix
* fiix
* fix
* fiiix meeeerge confliiiicts
* style
* modular
* modular
* small update...
* tests broken
* remapping a bunch of kwargs to encoder_kwargs to avoid contamination
* style
* modular mess up
* more fixes
* fix more things
* more fixes
* style
* address comments 1
* argh
* fiiix
* broken kosmos2 why
* decorators
* done
* fixup docs
* last fixes?
* bring back typing!
* modularities
* modulaaar
* style
* fix compile
* autodocstringss
* The greatest decorator upgrade
the great 2026 decorator upheaval
update everything
fixes?
style
* fix modular
* few fixes
* fixup wrong decorators
* fix gemma
* fix align
* revert numba changes
* fixup pp v2 modular
* remove unnecessary merge with defaults and return tuple + refactor kosmos models
* oops
* fixup repo
* pass kwargs on idefics properly
* new should have beens
* dang copy pasting is not my strong suit
* ... no comment
* fixup some pretrained stuff
* refactor zamba models properly + fixup gradient ckpting within merged defaults
* style
* several fixups
- gradient ckpting
- missing merge defaults
- input shape stuff
- probably even more
* style + fixup
* fix repo
* interesting
* small fixes
* fix align and other small fixes
* fix repo
* this time
* p1
* p2
* fix encoder got cache during generate
* let's try this
* fix
* revert higgs
* fixup models
* fix
* that copy is bad
* new round of fixes
* fixup some backbones
* style
* fix xlstm training inplace op
* fix repo
* revert the kwargs renames
* wrong decorator application
* style
* fix defaults for vision args
* fix consistency
* remove the user requested workaround, always return
* style
* some type annotations
* p1
* p2
* fix repo
* readd the workaround
* fix git
* fix vitpose backbone
* fixup
* quick fixes
* repo
* fix
* fix these
* x clip was implemented so wrong ICANT
* clipseg fixups
* fix repo
* fixup weird hidden states treatment, backbones are inherently already tested --> only need backbone mixin tests
* style
* fix last tests
* change backbones mixin instead with a wrapper always applied to the subclasses
* style
* force dict
* cleanup
* missed this one
* wow this was nasty
* refactor a few models
* fixups
* readd backbones to common tests
* revert gc changes
* fix repo
* explicit decorators
* revert zamba
* fixup the unrelated renaming in dino 3 vit
* fixup some modular stuff
* fixup weird zamba2 workaround
* fixup dino stuff
* idefics simplifications
* small fixes
* fix kosmos 2.5
* fixup
* fix clvp, same failure as main when fixing the return dict stuff
* same fix as in the other pr
* fixup some ci stuff
* style
* revert pr to capturing
---------
Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>