[SDP] Fix alignment check for efficient_attention (#90413)
Fixes a bug triggered by `head_dim_size == 100` on an A100 GPU. This PR adds stricter guards on the input shapes. The constraints are taken from xformers: https://github.com/facebookresearch/xformers/blob/gh/danthe3rd/60/orig/xformers/ops/fmha/cutlass.py#L23
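For context, a minimal sketch of the kind of guard this adds, assuming a hypothetical per-dtype minimum alignment (the function name `head_dim_is_aligned` and the specific alignment values are illustrative assumptions, not the actual PyTorch or xformers code):

```python
# Illustrative sketch only: the exact alignment rule lives in the CUTLASS
# constraints referenced above; the values below are assumptions.
import torch

def head_dim_is_aligned(query: torch.Tensor, debug: bool = False) -> bool:
    """Return True if query's last dim meets an assumed per-dtype
    alignment requirement (in elements) for the mem-efficient kernel."""
    # Hypothetical minimum alignments, in elements.
    min_alignment = {torch.float32: 4, torch.float16: 8, torch.bfloat16: 8}
    required = min_alignment.get(query.dtype)
    if required is None:
        return False
    head_dim = query.size(-1)
    if head_dim % required != 0:
        if debug:
            print(f"head_dim={head_dim} is not a multiple of {required} "
                  f"for dtype {query.dtype}; falling back to another kernel")
        return False
    return True

# Under these assumed alignments, head_dim_size == 100 would pass the
# fp32 rule (100 % 4 == 0) but fail the half-precision rule (100 % 8 != 0),
# so the dispatcher must reject it rather than run a misaligned kernel.
q = torch.randn(2, 8, 128, 100, dtype=torch.float16)
assert not head_dim_is_aligned(q, debug=True)
```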
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90413
Approved by: https://github.com/mikekgfb