[SDPA] Update dispatch logic to check for sm86 and head_size == 128 for flash attention (#94921)
Fixes #94883
The backward pass for flash attention on sm86 hardware with head_size == 128 is not supported, so the SDPA dispatcher must avoid selecting flash attention for that configuration.
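A minimal sketch of the kind of dispatch predicate this change describes. The function name and signature are hypothetical, not PyTorch's actual internal API; it only illustrates gating flash attention on the (sm86, head_size == 128, gradient-required) combination.

```python
def can_use_flash_attention(sm_major: int, sm_minor: int,
                            head_size: int, requires_grad: bool) -> bool:
    """Hypothetical check: return False when the flash attention
    backward kernel would be unsupported for this configuration."""
    is_sm86 = (sm_major, sm_minor) == (8, 6)
    # Backward for head_size == 128 is not supported on sm86, so only
    # reject flash attention when a gradient is actually needed.
    if is_sm86 and head_size == 128 and requires_grad:
        return False
    return True
```

With this shape of check, inference-only calls (requires_grad == False) can still use flash attention on sm86 with head_size == 128; only training falls back to another kernel.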
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94921
Approved by: https://github.com/cpuhrsch, https://github.com/albanD