Add LazyNVRTC (#45674)
Summary:
Instead of dynamically loading `caffe2_nvrtc`, lazyNVRTC provides the same functionality by binding all the hooks to lazy-binding stubs, much like the jump tables used by shared libraries:
On its first call, each stub tries to get a global handle to the respective shared library, resolves its own symbol dynamically, and replaces itself in the hook table with the resolved function, using the following template:
```
auto fn = reinterpret_cast<decltype(&NAME)>(getCUDALibrary().sym(C10_SYMBOLIZE(NAME)));
if (!fn)
  throw std::runtime_error("Can't get " #NAME);
lazyNVRTC.NAME = fn;
return fn(...);
```
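To make the self-replacing behavior concrete, here is a minimal standalone sketch of the same pattern. It is not the code from this PR: `fake_sym`, `real_add`, `LAZY_STUB`, `Hooks`, and `lazyHooks` are hypothetical names, and a lookup table stands in for `getCUDALibrary().sym(...)` so the example runs without any shared library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical "real" implementation, standing in for a symbol that a
// shared library would export.
int real_add(int a, int b) { return a + b; }

using AnyFn = void (*)();   // generic function-pointer type for the table
int resolve_count = 0;      // how many symbol lookups have happened

// Hypothetical stand-in for getCUDALibrary().sym(name):
// returns nullptr when the symbol is unknown.
AnyFn fake_sym(const std::string& name) {
  assert(!name.empty());
  ++resolve_count;
  static const std::unordered_map<std::string, AnyFn> table = {
      {"add", reinterpret_cast<AnyFn>(&real_add)}};
  auto it = table.find(name);
  return it == table.end() ? nullptr : it->second;
}

// Lazy stubs are declared first so the hook table can point at them.
namespace stubs {
int add(int a, int b);
int missing(int a);
}  // namespace stubs

// Hook table: every entry initially points at its lazy stub.
struct Hooks {
  int (*add)(int, int) = stubs::add;
  int (*missing)(int) = stubs::missing;
};
Hooks lazyHooks;

// On first call: resolve the symbol, overwrite the hook, forward the call.
#define LAZY_STUB(NAME, RET, ARGS_DECL, ARGS_CALL)                   \
  RET NAME ARGS_DECL {                                               \
    auto fn = reinterpret_cast<decltype(&NAME)>(fake_sym(#NAME));    \
    if (!fn)                                                         \
      throw std::runtime_error("Can't get " #NAME);                  \
    lazyHooks.NAME = fn;                                             \
    return fn ARGS_CALL;                                             \
  }

namespace stubs {
LAZY_STUB(add, int, (int a, int b), (a, b))
LAZY_STUB(missing, int, (int a), (a))
}  // namespace stubs
```

In this sketch, the first `lazyHooks.add(2, 3)` goes through the stub and performs one lookup; every later call through `lazyHooks.add` reaches `real_add` directly, so the lookup cost is paid once per function. A call to `lazyHooks.missing(1)` throws `std::runtime_error` because the table has no such symbol, mirroring the `if (!fn)` branch of the template.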
Fixes https://github.com/pytorch/pytorch/issues/31985
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45674
Reviewed By: ezyang
Differential Revision: D24073946
Pulled By: malfet
fbshipit-source-id: 1479a75e5200e14df003144625a859d312885874