llvm-project
24833808 - [CIR][CUDA][HIP] Support stream per thread kernel launch (#188004)

Commit
29 days ago
[CIR][CUDA][HIP] Support stream per thread kernel launch (#188004)

Related: #175871, #179278

When `-fgpu-default-stream=per-thread` is specified, CUDA and HIP kernels should be launched using the per-thread stream variants of the launch API instead of the default `cudaLaunchKernel`/`hipLaunchKernel`.

This PR implements that by selecting the correct launch function name in `emitDeviceStubBodyNew`:

- For CUDA: `cudaLaunchKernel_ptsz`
- For HIP: `hipLaunchKernel_spt`

This matches the behavior of the OG CodeGen implementation in `CGCUDANV.cpp`.
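
The name selection described in this commit message can be illustrated with a minimal standalone sketch. This is not the code from the PR: `GpuLanguage`, `DefaultStream`, and `selectLaunchKernelName` are hypothetical names introduced here for illustration, and the real `emitDeviceStubBodyNew` presumably reads the equivalent settings from the compiler's language options. Only the four launch function names are taken from the message itself.

```cpp
#include <string>

// Hypothetical stand-ins for the source language and the value of
// -fgpu-default-stream; these enums are assumptions, not CIR types.
enum class GpuLanguage { CUDA, HIP };
enum class DefaultStream { Legacy, PerThread };

// Returns the runtime launch function name the device stub should call.
// The default stream mode keeps the plain name; per-thread mode appends
// the per-thread suffix for the matching runtime.
std::string selectLaunchKernelName(GpuLanguage lang, DefaultStream stream) {
  const bool isHIP = (lang == GpuLanguage::HIP);
  std::string name = isHIP ? "hipLaunchKernel" : "cudaLaunchKernel";
  if (stream == DefaultStream::PerThread)
    name += isHIP ? "_spt" : "_ptsz";
  return name;
}
```

Under these assumptions, a CUDA build with the per-thread option would resolve to `cudaLaunchKernel_ptsz`, while a HIP build without it would keep `hipLaunchKernel`.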