:red_circle: Update CLIP vision attention to new attention interface (#37498)
* update attention interface
* fix test
* propagate attention changes
* revert weird changes
* fix modular
* what?
* ruff is mocking me
* ruff being ruff
* simplify test suite + fix FA2
* fixup tests + propagate FA2 fixes
* add `Copied from` markers where relevant
* fix conflict between copies and modular
* recover FA2 training for CLIP + handle quantization
* don't ditch the warning
* tiny import fix
* code review (FA2 support, copied from)
* fix style
* modularity
* wrong copies
* future-proofing for TP (tensor parallelism)
* MLCD inherits from CLIP