[Core] MobileClip Attention Fusion (#27883)
### Description
Update the Attention Fusion optimizer to fuse the Attention subgraph pattern found in the MobileClip model. The perf gain from the fusion itself is modest (mostly from launching fewer kernels); the real gain comes after the fusion, i.e., from tuning the performance of the MHA kernel for the problem shapes seen in this model.
The model contains two Attention blocks, and this update fuses both of them.
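The general idea behind such a fusion can be sketched on a toy graph: scan for a known chain of ops and replace it with a single fused node. This is a minimal illustrative sketch only; the node class and the simplified three-node pattern below are assumptions for illustration, not ORT's actual fusion pattern, which also matches the surrounding projection, reshape, and mask nodes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy graph node: op type, input names, single output name."""
    op: str
    inputs: list
    output: str


def fuse_attention(nodes):
    """Match the simplified chain MatMul -> Softmax -> MatMul
    (scores -> probs -> context) and replace it with one fused
    'Attention' node taking Q, K, and V as inputs."""
    by_output = {n.output: n for n in nodes}
    fused, consumed = [], set()
    for n in nodes:
        if n.op == "MatMul":
            probs = by_output.get(n.inputs[0])
            if probs is not None and probs.op == "Softmax":
                scores = by_output.get(probs.inputs[0])
                if scores is not None and scores.op == "MatMul":
                    # Replace the three-node chain with a single fused node.
                    fused.append(Node("Attention",
                                      scores.inputs + [n.inputs[1]],
                                      n.output))
                    consumed |= {scores.output, probs.output, n.output}
                    continue
        fused.append(n)
    # Drop the now-redundant producers of the fused-away intermediates.
    return [m for m in fused
            if m.output not in consumed or m.op == "Attention"]
```

A fused node like this then dispatches to one MHA kernel instead of several small ones, which is where the subsequent shape-specific tuning mentioned above applies.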
### Motivation and Context
Improve the performance of the MobileClip model.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>