pytorch
152d65ae - [reland][inductor] Enable CudaWrapperCodeGen for non-AOT mode (#98534)

Commit
2 years ago
[reland][inductor] Enable CudaWrapperCodeGen for non-AOT mode (#98534)

Summary: This is a reland of #98264.

When _inductor.config.cpp_wrapper is set, we run a two-pass wrapper codegen that generates the wrapper code in C++. The generated wrapper calls cuLaunchKernel to launch pre-compiled CUDA kernels, and we then call load_inline to load that wrapper back into Python.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98534
Approved by: https://github.com/huydhn
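A minimal sketch of how the flag described above might be exercised from user code. It is not part of this commit: the function `fn`, the tensor sizes, and the helper name `cpp_wrapper_matches_eager` are hypothetical, and it assumes a PyTorch build where `torch.compile` and `torch._inductor.config.cpp_wrapper` are available, plus a CUDA device and a working C++ toolchain. When torch or CUDA is missing, the helper returns None instead of running.

```python
def cpp_wrapper_matches_eager():
    """Compare eager and compiled results with cpp_wrapper enabled.

    Returns True/False when the comparison ran, or None when torch or a
    CUDA device is not available (assumption: nothing to demonstrate then).
    """
    try:
        import torch
        import torch._inductor.config as inductor_config
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    def fn(x, y):  # hypothetical example function, not from the commit
        return torch.sin(x) + torch.cos(y)

    previous = inductor_config.cpp_wrapper
    inductor_config.cpp_wrapper = True  # the flag named in the commit message
    try:
        compiled = torch.compile(fn)
        x = torch.randn(1024, device="cuda")
        y = torch.randn(1024, device="cuda")
        # The first call triggers codegen and loading of the wrapper.
        return bool(torch.allclose(compiled(x, y), fn(x, y), atol=1e-5))
    finally:
        # Restore the global config so other code is unaffected.
        inductor_config.cpp_wrapper = previous


if __name__ == "__main__":
    print(cpp_wrapper_matches_eager())
```

The flag is restored in a `finally` block because it is a process-wide setting; leaving it on would change how later `torch.compile` calls generate their wrappers.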