[AOTInductor] Store loaded kernels in the model (#110554)
Defining kernels as static variables is problematic when a model is subsequently loaded on a non-default CUDA device: once those kernels have been loaded in the context of device #0, they are no longer nullptr, so they are never reloaded and won't work on any device other than device #0.
This change stores the loaded kernels at the model level in AOT mode.
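To illustrate the difference, here is a minimal sketch assuming the kernels are CUfunction handles obtained through the CUDA driver API; the identifiers (launch_kernel_static, AOTModel, "triton_kernel_0") are hypothetical and are not the actual AOTInductor-generated code:

```cpp
#include <cuda.h>

// Before: a static handle is shared process-wide. Once it is loaded in the
// context of device #0 it is no longer nullptr, so a model instance created
// later on another device skips loading and ends up with a handle bound to
// the wrong device.
void launch_kernel_static(CUmodule mod) {
  static CUfunction kernel = nullptr;  // shared across all model instances
  if (kernel == nullptr) {
    cuModuleGetFunction(&kernel, mod, "triton_kernel_0");
  }
  // ... cuLaunchKernel(kernel, ...) ...
}

// After: the handle lives in the per-model object, so each model instance
// loads its own copy for the device it was created on.
struct AOTModel {
  CUfunction kernel_ = nullptr;  // owned by this model instance
  void launch(CUmodule mod) {
    if (kernel_ == nullptr) {
      cuModuleGetFunction(&kernel_, mod, "triton_kernel_0");
    }
    // ... cuLaunchKernel(kernel_, ...) ...
  }
};
```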
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110554
Approved by: https://github.com/chenyang78, https://github.com/desertfire