Use TORCH_CHECK instead of inappropriate CUDA_KERNEL_ASSERT (#87714)
`CUDA_KERNEL_ASSERT` should only be used inside CUDA kernels; switch the usages outside kernels to `TORCH_CHECK`.
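
The distinction can be sketched roughly as follows. This is a self-contained illustration, not code from this PR: `SKETCH_CHECK`, `launch_gather`, and `rejected` are hypothetical names, and `SKETCH_CHECK` is only a stand-in that mimics the throw-on-failure behavior of `TORCH_CHECK` so the snippet builds without the c10 headers.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for TORCH_CHECK (assumption: the real macro lives in
// c10 and throws an exception type derived from std::exception on failure).
#define SKETCH_CHECK(cond, msg)        \
  do {                                 \
    if (!(cond)) {                     \
      throw std::runtime_error(msg);   \
    }                                  \
  } while (0)

// Hypothetical host-side launcher. Arguments are validated on the host,
// before any kernel would be launched, so a failure surfaces as a normal
// exception that the caller can catch and report.
inline void launch_gather(std::int64_t index, std::int64_t size) {
  SKETCH_CHECK(index >= 0 && index < size,
               "index " + std::to_string(index) +
                   " is out of range for size " + std::to_string(size));
  // A real launcher would start the kernel here. Invariants that can only be
  // checked per element on the device are where CUDA_KERNEL_ASSERT belongs.
}

// Helper that reports whether the host-side check rejected the arguments.
inline bool rejected(std::int64_t index, std::int64_t size) {
  try {
    launch_gather(index, size);
    return false;
  } catch (const std::runtime_error&) {
    return true;
  }
}
```

With this sketch, `rejected(5, 3)` reports a failed check while `rejected(1, 3)` does not. In both cases the caller keeps control, which is the behavior wanted for checks that run outside a kernel.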
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87714
Approved by: https://github.com/ezyang