TARGETS changes for flash attention and cutlass (#84781)
Summary: Integrate flash attention and use it only when the inputs meet its requirements.
Test Plan: Unit tests.
Differential Revision: D39364603
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84781
Approved by: https://github.com/mikaylagawarecki