[CUDA][CUBLAS][BFloat16] Tentatively disable reduced precision reductions for some matmul tests (#92599)
We've observed failures in numerical checks on newer compute capabilities that stem from cuBLAS allowing reduced-precision reductions in BFloat16 matmuls. This change tentatively disables those reductions for the affected matmul tests.
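
For illustration only, here is a minimal sketch of how a test could turn the reduction setting off for one block of code and restore it afterwards. It assumes the setting is exposed as `torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction`; the helper names `temporarily_set` and `bf16_matmul_without_reduced_reduction` are hypothetical and are not taken from this PR's diff.

```python
import contextlib


@contextlib.contextmanager
def temporarily_set(obj, name, value):
    """Set obj.<name> to value inside the block, then restore the old value."""
    old = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield
    finally:
        # Restore even if the body raised, so later tests see the original setting.
        setattr(obj, name, old)


def bf16_matmul_without_reduced_reduction(a, b):
    """Hypothetical helper: run a @ b with BF16 reduced-precision reductions disabled.

    Assumes a PyTorch build that exposes
    torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction.
    """
    import torch  # imported lazily so the context manager is usable without PyTorch

    with temporarily_set(
        torch.backends.cuda.matmul,
        "allow_bf16_reduced_precision_reduction",
        False,
    ):
        return torch.matmul(a, b)
```

Scoping the change with a context manager keeps the override local to the tests that need it, rather than changing the process-wide default for every other test.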
CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92599
Approved by: https://github.com/ngimel