onnxruntime
9810b9e0 - Reduce amount of compiled CUDA device code (#6118)

Commit
5 years ago
Reduce amount of compiled CUDA device code (#6118)

Move CudaKernel from cuda_common.h to a new separate header, cuda_kernel.h. Update include sites to use cuda_kernel.h instead if they need CudaKernel. Inclusions of cuda_common.h are now more lightweight.

Make corresponding changes for ROCM execution provider code.

Other minor cleanup.
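The header split described in this commit message can be illustrated with a minimal single-file sketch. Everything in it is hypothetical: the names `sketch::CeilDiv`, `sketch::KernelBase`, `sketch::BlockCountKernel`, and `sketch::GridSizeFor`, and the file names in the comments, are illustrative stand-ins and not the actual contents of cuda_common.h or cuda_kernel.h. The idea it tries to show is that a translation unit needing only small helpers can include the lightweight header, while only files that derive from the kernel base class include the heavier one.

```cpp
// Single-file sketch: each marked section stands in for a separate file.
// All names below are hypothetical and chosen for illustration only.

// ---- common.h (hypothetical): lightweight helpers, no kernel base class ----
namespace sketch {
// Integer division rounded up, e.g. to compute how many blocks cover n items.
inline int CeilDiv(int a, int b) { return (a + b - 1) / b; }
}  // namespace sketch

// ---- kernel.h (hypothetical): would include common.h, adds the base class ----
namespace sketch {
class KernelBase {
 public:
  virtual ~KernelBase() = default;
  virtual int Compute(int n) const = 0;
};
}  // namespace sketch

// ---- my_op.cc (hypothetical): needs the base class, so it includes kernel.h ----
namespace sketch {
class BlockCountKernel final : public KernelBase {
 public:
  explicit BlockCountKernel(int threads_per_block)
      : threads_per_block_(threads_per_block) {}
  // Returns the number of blocks needed to cover n elements.
  int Compute(int n) const override { return CeilDiv(n, threads_per_block_); }

 private:
  int threads_per_block_;
};
}  // namespace sketch

// ---- my_op_impl.cu (hypothetical): only needs helpers, includes common.h ----
namespace sketch {
// This file never sees KernelBase, so the device compiler has less to parse.
inline int GridSizeFor(int n) { return CeilDiv(n, 256); }
}  // namespace sketch
```

In this arrangement, a change to the base class would only trigger recompilation of files that include the hypothetical kernel.h, which is one plausible reason such a split makes inclusions of the common header cheaper.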