[PyTorch] Make TORCH_CHECK less likely to interfere with inlining (#49263)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49263
TORCH_CHECK is now smaller at each call site and calls an
out-of-line function in case of failure.
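
For illustration only, a minimal sketch of the general pattern (keep the
check itself tiny, move the failure path into a non-inlined function).
All names here (`SKETCH_CHECK`, `sketchCheckFail`, `getSlot`) are
hypothetical and are not the actual c10 macro or helpers; the real
change is in the PR linked above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper macros; assumptions for this sketch, not c10's own.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#define SKETCH_UNLIKELY(x) __builtin_expect(static_cast<bool>(x), 0)
#elif defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#define SKETCH_UNLIKELY(x) (x)
#else
#define SKETCH_NOINLINE
#define SKETCH_UNLIKELY(x) (x)
#endif

// Cold path: formats the message and throws. Because it is out of line,
// the string handling is not duplicated at every call site.
[[noreturn]] SKETCH_NOINLINE void sketchCheckFail(
    const char* func, const char* file, int line, const char* msg) {
  throw std::runtime_error(
      std::string(file) + ":" + std::to_string(line) + " (" + func +
      "): " + msg);
}

// Hot path: one compare-and-branch plus a call on failure, so a caller
// that uses the macro stays small enough to be an inlining candidate.
#define SKETCH_CHECK(cond, msg)                                   \
  do {                                                            \
    if (SKETCH_UNLIKELY(!(cond))) {                               \
      sketchCheckFail(__func__, __FILE__, __LINE__, (msg));       \
    }                                                             \
  } while (0)

// Example of a small accessor that should remain inlinable.
inline int getSlot(const int* table, int size, int index) {
  SKETCH_CHECK(index >= 0 && index < size, "index out of range");
  return table[index];
}
```

The design choice is that only the cheap condition test lives in the
caller; everything needed to build and throw the error sits behind a
single call that is expected to be rarely taken.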
ghstack-source-id: 118480531
Test Plan:
1) Inspect the perf profile of an internal benchmark: much less
time is spent in (for example) `c10::impl::getDeviceImpl`, which calls
TORCH_CHECK and should be inlined
2) Internal benchmarks
Reviewed By: smessmer
Differential Revision: D25481308
fbshipit-source-id: 0121ada779ca2518ca717f75920420957b3bb1aa