fix: address review feedback - SafeInt AlignTo, y_bnsh H_v, ORT_ENFORCE

- AlignTo now accepts SafeInt<size_t>, so overflow protection is
maintained through the alignment arithmetic instead of ending before
it (closes the SafeInt gap).
- y_bnsh_bytes uses H_v (v_head_size) instead of H (head_size) for the
Y output buffer to prevent latent under-allocation if v_head_size
ever differs from head_size.
- Add an ORT_ENFORCE(head_size == v_head_size) check in
UnfusedGqaAttention to make the invariant explicit.
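
For reviewers, the first two points can be illustrated with a minimal standalone sketch. It is not the code in this change: CheckedMul, CheckedAlignTo and YBnshBytes are hypothetical names, plain size_t checks stand in for SafeInt<size_t>, and the power-of-two alignment and the B * N * S * H_v * element-size layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for SafeInt<size_t> multiplication: throws on overflow.
inline std::size_t CheckedMul(std::size_t a, std::size_t b) {
  if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
    throw std::overflow_error("size_t multiplication overflow");
  }
  return a * b;
}

// Hypothetical overflow-checked align-up. `alignment` is assumed to be a
// non-zero power of two. The addition inside the rounding is checked too,
// so the protection does not stop before the alignment step.
inline std::size_t CheckedAlignTo(std::size_t value, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::size_t mask = alignment - 1;
  if (value > std::numeric_limits<std::size_t>::max() - mask) {
    throw std::overflow_error("size_t alignment overflow");
  }
  return (value + mask) & ~mask;
}

// Hypothetical sizing of the Y output buffer. The last dimension is
// v_head_size (H_v), not head_size (H), so the buffer stays large enough
// even if the two ever differ.
inline std::size_t YBnshBytes(std::size_t batch, std::size_t num_heads,
                              std::size_t seq_len, std::size_t v_head_size,
                              std::size_t elem_size, std::size_t alignment) {
  std::size_t bytes = CheckedMul(batch, num_heads);
  bytes = CheckedMul(bytes, seq_len);
  bytes = CheckedMul(bytes, v_head_size);
  bytes = CheckedMul(bytes, elem_size);
  return CheckedAlignTo(bytes, alignment);
}
```

In this sketch every intermediate product and the final rounding go through a checked operation, which is the property the AlignTo change is meant to preserve.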