[Model] Refactor modernbert with the attention interface (#43030)
* Refactor FA interface
* Simplify code
* Update
* Simplify the code
* Remove comments
* Fix UT
* Fix _attn_implementation UT
* Fix TimmWrapperForImageClassification sdpa dispatch UT
* Remove unused parameter
* Revise based on comments
* Merge main branch
* Fix CI error
* Fix UT
* Refine code based on comments
* check_config_attributes
* Fix UT errors
* Refine code
* Revert test_modeling_common.py
* Revert file
* Revert file
* Fix rotary(.float())
* Fix sliding window errors
* Update modeling files
* Fix
* Fix flash, sdpa, and eager attention paths
* style
* Merge main
* Fix CI error
* Update
* Update the doc
* Update code based on comments
* make fix-repo
* Fix CI
* check_config_attributes
* fix
* refactor tests a bit more + fa integration test
* style
* Fix UTs
* resolve conflicts
* Update src/transformers/models/modernbert/configuration_modernbert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Revise based on comments
---------
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: vasqu <antonprogamer@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>