Move TF building to an actual build() method (#23760)
* A fun new PR where I break the entire codebase again
* A fun new PR where I break the entire codebase again
* Handle cross-attention
* Move calls to model(model.dummy_inputs) to the new build() method
* Seeing what fails with the build context thing
* make fix-copies
* Let's see what fails with new build methods
* Fix the pytorch crossload build calls
* Fix the overridden build methods in vision_text_dual_encoder
* Make sure all our build methods set self.built or call super().build(), which also sets it
* make fix-copies
* Remove finished TODO
* Tentatively remove unneeded (?) line
* Transpose b in deberta correctly and remove unused threading local
* Get rid of build_with_dummies and all it stands for
* Rollback some changes to TF-PT crossloading
* Correctly call super().build()