[PyTorch] inline Dispatcher::singleton (#50644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50644
The dispatcher is a very hot code path, and not inlining
`Dispatcher::singleton()` was hurting performance.
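
For illustration only, a minimal sketch of the general technique: defining a singleton accessor in the header so the compiler can inline it at call sites. `Registry` and `bump()` are hypothetical names, not the actual `Dispatcher` code, and the real change may be structured differently.

```cpp
#include <cstdint>

// Hypothetical stand-in for a hot-path singleton; this is NOT c10::Dispatcher.
class Registry {
 public:
  // Defined inside the class body, so the function is implicitly inline:
  // every translation unit that includes this header sees the definition,
  // which lets the compiler inline the call instead of emitting an
  // out-of-line call on each use.
  static Registry& singleton() {
    static Registry instance;  // initialized once; thread-safe since C++11
    return instance;
  }

  // Hypothetical hot-path operation, used here only to observe shared state.
  std::uint64_t bump() { return ++calls_; }

 private:
  Registry() = default;
  std::uint64_t calls_ = 0;
};
```

Whether the compiler actually inlines the accessor depends on the optimization level and toolchain, so a change like this is worth confirming with a profiler.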
Test Plan:
Profiled our internal `empty()` benchmark. `perf stat` shows
about a 1.7% reduction in cycles spent, and the benchmark's own timing
shows a small reduction as well.
Reviewed By: dzhulgakov, bhosmer
Differential Revision: D25935275
fbshipit-source-id: a328f8ac8ea479bbe5c6ddb80f98838ae6058bbd