Separate TLS for InferenceMode (#55424)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55424
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55238
I tried to avoid creating new TLS, but InferenceMode::is_enabled()
is on a perf-critical path (the TensorImpl constructor), so it seems
worth adding a dedicated thread-local flag for it.
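The idea, roughly: instead of answering InferenceMode::is_enabled() by querying the dispatch key TLS (the c10::impl::tls_is_dispatch_key_included calls visible in the diff below), keep a dedicated thread-local boolean that the scoped guard flips on entry and restores on exit, so the hot check is a single flag read. Below is a minimal Python sketch of that pattern; the real change is in C++ inside c10, and the names here are illustrative, not PyTorch's actual API.
```
import threading

_tls = threading.local()  # per-thread storage for the dedicated flag

class InferenceMode:
    # Scoped guard: set the flag on entry, restore the old value on exit.
    def __enter__(self):
        self._prev = getattr(_tls, "inference_mode", False)
        _tls.inference_mode = True
        return self

    def __exit__(self, exc_type, exc, tb):
        _tls.inference_mode = self._prev
        return False

    @staticmethod
    def is_enabled():
        # Hot-path check: a plain thread-local read, with no dispatch
        # key set lookup involved.
        return getattr(_tls, "inference_mode", False)

assert not InferenceMode.is_enabled()
with InferenceMode():
    assert InferenceMode.is_enabled()   # flag set inside the scope
assert not InferenceMode.is_enabled()   # restored afterwards
```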
This PR removes one source of the instruction count increase introduced by
https://github.com/pytorch/pytorch/pull/55008.
```
λ ~ python compare.py
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f59097ef310>
100 0x0000000004854750
-100 0x0000000004854760
-4400 c10::impl::tls_is_dispatch_key_included(...)
```
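For reference, here is a hypothetical reconstruction of what a compare.py like the one above might do (the actual script is not included in this PR description), using torch.utils.benchmark's Callgrind support. The torch.empty(()) workload is an assumption, chosen because it exercises the TensorImpl constructor.
```
import pickle
import sys

from torch.utils.benchmark import Timer


def collect_counts():
    # Requires valgrind to be installed; counts the instructions
    # executed by the statement under callgrind.
    timer = Timer(stmt="torch.empty(())", setup="import torch")
    return timer.collect_callgrind(number=100)


if __name__ == "__main__":
    if sys.argv[1] == "collect":
        # Run once per build, e.g.: python compare.py collect out.pkl
        with open(sys.argv[2], "wb") as f:
            pickle.dump(collect_counts(), f)
    elif sys.argv[1] == "diff":
        # python compare.py diff baseline.pkl new.pkl
        with open(sys.argv[2], "rb") as f:
            baseline = pickle.load(f)
        with open(sys.argv[3], "rb") as f:
            candidate = pickle.load(f)
        # FunctionCounts delta: positive rows are instructions added,
        # negative rows are instructions saved by the new build.
        print(candidate.delta(baseline))
```
Run `collect` once on the baseline build and once on this PR's build, then `diff` the two pickles; negative rows in the printed FunctionCounts are instructions saved, like the -4400 for tls_is_dispatch_key_included above.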
Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D27539230
Pulled By: ailzhang
fbshipit-source-id: e040877faef966dca3c2c3d5f9e9a80496c81415