Split CUDA SpectralOp (#58459)
Summary:
Move all cuFFT-related parts to SpectralOps.cpp
Leave only _fft_fill_with_conjugate_symmetry_cuda_ in SpectralOps.cu
Keep `CUDAHooks.cpp` in torch_cuda_cpp by introducing the `at::cuda::detail::THCMagma_init` functor and registering it from a global constructor in `THCTensorMathMagma.cu`
Move the entire detail folder to the torch_cuda_cpp library.
This is a functional no-op that helps greatly reduce binary size for CUDA-11.x builds by avoiding cuFFT/cuDNN symbol duplication between torch_cuda_cpp (which makes most of the cuFFT calls) and torch_cuda_cu (which only needed cuFFT to compile SpectralOps.cu)
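
For readers unfamiliar with the registration pattern behind `THCMagma_init`, here is a minimal sketch of the general idea: one library exposes an empty initializer slot, and the other fills it from a global constructor, so the first needs no link-time dependency on the second. All names below (`hooks::LazyInit`, `magma_init`, `Registrar`, `real_magma_init`) are hypothetical and are not the actual PyTorch definitions.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// --- Would live in the "cpp" library (hypothetical names) ---
namespace hooks {

// A slot holding an initializer that some other library provides later.
class LazyInit {
 public:
  // Store the initializer; called by the library that owns the real code.
  void set(std::function<void()> fn) { fn_ = std::move(fn); }

  // Run the initializer if one was registered; report whether it ran.
  bool call() const {
    if (!fn_) return false;
    fn_();
    return true;
  }

 private:
  std::function<void()> fn_;
};

// Function-local static avoids static-initialization-order problems.
LazyInit& magma_init() {
  static LazyInit instance;
  return instance;
}

}  // namespace hooks

// --- Would live in the "cu" library (hypothetical names) ---
namespace {

int g_init_calls = 0;

// Stand-in for the real initialization routine.
void real_magma_init() { ++g_init_calls; }

// The constructor of this global object runs when the library is loaded
// and fills the slot owned by the other library.
struct Registrar {
  Registrar() { hooks::magma_init().set(real_magma_init); }
};
Registrar g_registrar;

}  // namespace

// Helper for inspecting how often the initializer has run.
int init_calls() { return g_init_calls; }
```

With this layout, code on the "cpp" side only ever calls `hooks::magma_init().call()`, and the call reports failure instead of failing to link when the "cu" side has not been loaded.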
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58459
Reviewed By: ngimel
Differential Revision: D28499001
Pulled By: malfet
fbshipit-source-id: 425a981beb383c18a79d4fbd9b49ddb4e5133291