:rotating_light: [`Attn`] New attn mask interface everywhere (#42848)
* fix
* fix order
* style
* vision 3d rope get extra test for now
* fix gpt2
* more gpt2 fixes
* let's see...
* fix
* test
* fix opt+biogpt
* fix
* fix
* fix
* fix opt
* mask exchange test
* style
* several small fixes
* shouldn't be needed
* fix zamba models
* retrigger ci
* force skip for now
* this won't work, will fix step by step
* to git
* another batch
* fix a few models, clip related models are gonna be hard...
* another batch
* style
* fix gpt2 attempt
* another batch + some models do not set their attn implementation? TODO
* fix
* last models
* style
* repo fix
* check
* some quick fixes, error to catch wrong inits in some models
* small fixes
* fixes for wrong mask pretrained model relation
* fix
* remove mask defaulting --> that's part of the prep + fixup some other tests
* small fixes
* fix last few models --> last to check recurrent gemma + repo consistency
* fixup test cleanup
* revert these tests
* these were not necessary, they have a proper top module
* fixup kwargs
* remove old API
* more kwargs
* let's revert this - I'm in a fork :D
* fix
* dang
* revert removal and add deprecation msg
* kwargs typing
* style