[CUDA] Implement __CUDA_ARCH_LIST__ macro and refactor architecture helpers (#175260)
Closes #172937
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#virtual-architecture-macros
> The architecture list macro `__CUDA_ARCH_LIST__` is a list of
> comma-separated `__CUDA_ARCH__` values for each of the virtual
> architectures specified in the compiler invocation. The list is sorted
> in numerically ascending order.

Note that, unlike NVCC, which defines the macro for every C/C++/CUDA
compilation it performs, clang defines the macro *only* for CUDA
compilations.
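
A minimal sketch of how code shared between CUDA and plain C++ translation
units might consume the macro. Because the macro expands to a bare
comma-separated list (for example `750,800` when targeting sm_75 and sm_80,
an illustrative value), it can initialize an array directly. The names
`kArchList`, `kArchCount`, and `builtForArch` are hypothetical and not part
of this change; the `#ifdef` guard covers the non-CUDA case described above.

```cpp
#include <cstddef>

#ifdef __CUDA_ARCH_LIST__
// CUDA compilation: the macro expands to e.g. 750,800 (ascending order).
static const int kArchList[] = {__CUDA_ARCH_LIST__};
static const std::size_t kArchCount = sizeof(kArchList) / sizeof(kArchList[0]);
#else
// Non-CUDA compilation with clang: the macro is not defined.
static const int kArchList[] = {0};  // placeholder, never read
static const std::size_t kArchCount = 0;
#endif

// Hypothetical helper: returns true if `arch` (a __CUDA_ARCH__-style value
// such as 800) is among the architectures this translation unit targets.
inline bool builtForArch(int arch) {
  for (std::size_t i = 0; i < kArchCount; ++i)
    if (kArchList[i] == arch)
      return true;
  return false;
}
```

Code that must also build with nvcc as a host-only C/C++ compilation should
not assume the macro is absent there, which is why the sketch tests for the
macro itself instead of `__CUDA__` or `__CUDACC__`.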