Fix `CUDA_MAX_THREADS_PER_SM` for `sm_89`
Same approach as #88644, applied to `sm_89`, to fix `ptxas` warnings like
`ptxas warning : Value of threads per SM for entry _ZN2at6native13reduce_kernelILi512ELi1ENS0_8ReduceOpIfNS0_10NormTwoffEEjfLi4EEEEEvT1_ is out of range. .minnctapersm will be ignored`
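
For context, here is a rough host-side sketch of why the warning appears and why lowering the limit silences it. It rests on several assumptions: that `sm_89` allows 1536 resident threads per SM (like `sm_86`), and that the kernel in the warning asks for 4 blocks of 512 threads through its launch bounds. The helpers `max_threads_per_sm` and `min_blocks_per_sm` are hypothetical names that only illustrate the clamping idea; they are not the actual macros touched by this PR.

```cpp
#include <cstdint>

// Hypothetical helper: resident-thread limit per SM, keyed by the value
// __CUDA_ARCH__ would have. Only a few architectures are modeled; the
// 2048 fallback is a simplification for this sketch.
constexpr std::uint32_t max_threads_per_sm(int cuda_arch) {
  return cuda_arch == 750 ? 1024u
       : (cuda_arch == 860 || cuda_arch == 890) ? 1536u
       : 2048u;
}

// Hypothetical helper: clamp the requested "min blocks per SM" so that
// threads_per_block * blocks never exceeds the per-SM thread limit.
constexpr std::uint32_t min_blocks_per_sm(std::uint32_t threads_per_block,
                                          std::uint32_t requested_blocks,
                                          std::uint32_t limit) {
  return threads_per_block * requested_blocks <= limit
             ? requested_blocks
             : (limit + threads_per_block - 1) / threads_per_block;
}

// With a limit of 2048, 512 * 4 fits, so 4 is passed through unchanged.
// On hardware that only allows 1536 threads per SM, ptxas rejects that value.
static_assert(min_blocks_per_sm(512, 4, 2048) == 4, "no clamping at 2048");

// With the limit set to 1536, the request is clamped to 3 blocks
// (3 * 512 = 1536), which stays in range.
static_assert(min_blocks_per_sm(512, 4, max_threads_per_sm(890)) == 3,
              "clamped to 3 blocks at 1536");
```

Under these assumptions, an unclamped request of 512 × 4 = 2048 threads exceeds what an `sm_89` SM can hold, which is the "out of range" condition `ptxas` reports before ignoring `.minnctapersm`.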
CC @ptrblck @ngimel