[Inductor] Add new fused_attention pattern matcher (#107578)
Add a new fused_attention pattern to the Inductor pattern matcher so that more models call the scaled dot product attention (SDPA) op.
With the added pattern, the following models would call SDPA:
For HuggingFace:
- AlbertForMaskedLM
- AlbertForQuestionAnswering
- BertForMaskedLM
- BertForQuestionAnswering
- CamemBert
- ElectraForCausalLM
- ElectraForQuestionAnswering
- LayoutLMForMaskedLM
- LayoutLMForSequenceClassification
- MegatronBertForCausalLM
- MegatronBertForQuestionAnswering
- MobileBertForMaskedLM
- MobileBertForQuestionAnswering
- RobertaForCausalLM
- RobertaForQuestionAnswering
- YituTechConvBert
For TorchBench:
- llama
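For context, the kind of rewrite a fused_attention pattern targets can be sketched as below. This is a minimal illustration, assuming a generic unmasked attention with no dropout; it is not necessarily the exact pattern added in this PR, and the helper names `unfused_attention` and `fused_attention` are hypothetical. It only shows that the hand-written matmul/softmax/matmul sequence and `F.scaled_dot_product_attention` compute the same result.

```python
import math

import torch
import torch.nn.functional as F


def unfused_attention(q, k, v):
    # Hand-written attention: softmax(Q @ K^T / sqrt(head_dim)) @ V.
    # Illustrative shape only; the pattern in this PR may differ.
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    return torch.matmul(torch.softmax(scores, dim=-1), v)


def fused_attention(q, k, v):
    # Single SDPA call; the default scale is 1 / sqrt(head_dim).
    return F.scaled_dot_product_attention(
        q, k, v, attn_mask=None, dropout_p=0.0, is_causal=False
    )


torch.manual_seed(0)
# (batch, heads, seq_len, head_dim); sizes chosen arbitrarily for the sketch.
q, k, v = (torch.randn(2, 4, 8, 16) for _ in range(3))

max_abs_diff = (
    (unfused_attention(q, k, v) - fused_attention(q, k, v)).abs().max().item()
)
print(f"max abs difference: {max_abs_diff:.2e}")
```

The two functions should agree up to floating-point error, which is what makes replacing the first form with the second safe.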
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107578
Approved by: https://github.com/mingfeima, https://github.com/XiaobingSuper, https://github.com/jgong5, https://github.com/eellison, https://github.com/jansel