[Inductor] Record cpp kernel in PyTorch Profiler (#89367)
Add an option, `config.cpp.enable_kernel_profile`, to record the execution time of individual cpp kernels in the PyTorch Profiler.
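
A minimal usage sketch, not part of the change itself. It assumes the option is reached through `torch._inductor.config`, that the PyTorch build is recent enough to provide `torch.compile`, and that the flag is read when the kernel is generated, so it is set before the first compiled call. `profile_once` is a hypothetical helper name used only for illustration.

```python
import torch
from torch._inductor import config
from torch.profiler import ProfilerActivity, profile

# Option added by this change. Assumption: it must be set before the
# first compiled call, because it affects the generated cpp code.
config.cpp.enable_kernel_profile = True


def profile_once(fn, *args):
    """Hypothetical helper: compile `fn` with Inductor and profile one CPU run."""
    compiled = torch.compile(fn)  # Inductor is the default backend
    compiled(*args)  # warm-up call, so compilation happens outside the profiler
    with profile(activities=[ProfilerActivity.CPU]) as prof:
        compiled(*args)
    # Summary table of recorded events, sorted by self CPU time.
    return prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=20)
```

Calling, for example, `print(profile_once(lambda x: torch.sin(x) + 1, torch.randn(1024)))` on a machine with a working C++ toolchain should then list the generated cpp kernels as separate rows. The exact row names depend on the Inductor version.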
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89367
Approved by: https://github.com/jansel