[PyTorch] check isValidUnboxed() in the dispatcher (#51247)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51247
See code comment for explanation.
Measured with `perf stat` on a benchmark that calls `empty` in a loop,
this is performance-neutral relative to the previous diff. I think that
we should commit it anyway because:
1) I have previously seen it make a difference when applied earlier in
the stack.
2) This makes sense both in principle and from inspecting the output
assembly: we (usually) avoid touching the boxed kernel at all and
instead use the unboxed kernel for both the validity check in
`OperatorEntry::lookup` and the actual `KernelFunction::call`.
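
A minimal sketch of the idea in point 2, for readers without the diff
open. This is not the dispatcher code: `ToyKernel`, `lookupIsValid`,
`addOneBoxed`, and `addOneUnboxed` are hypothetical names, and the sketch
assumes that a valid kernel always has a boxed function pointer and may
also have an unboxed one.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a dispatcher kernel entry.
// Assumption of this sketch: a valid kernel always has a boxed function
// pointer and may additionally have an unboxed one.
struct ToyKernel {
  using BoxedFn = void (*)(int* slot);  // argument and result share one slot
  using UnboxedFn = int (*)(int);       // direct, typed call

  BoxedFn boxed;
  UnboxedFn unboxed;

  ToyKernel(BoxedFn b, UnboxedFn u) : boxed(b), unboxed(u) {}

  bool isValid() const { return boxed != nullptr; }
  bool isValidUnboxed() const { return unboxed != nullptr; }

  // Prefer the unboxed pointer; fall back to the boxed one.
  int call(int arg) const {
    if (unboxed != nullptr) {
      return unboxed(arg);
    }
    assert(boxed != nullptr);
    int slot = arg;
    boxed(&slot);
    return slot;
  }
};

// Validity check in the spirit of the lookup described above: test the
// unboxed pointer first, so the common path reads only the field that
// call() is about to use. The boxed pointer is read only on the rare path.
inline bool lookupIsValid(const ToyKernel& k) {
  if (k.isValidUnboxed()) {
    return true;  // relies on the assumption that unboxed implies boxed
  }
  return k.isValid();
}

// Toy kernels used to exercise the sketch.
inline int addOneUnboxed(int x) { return x + 1; }
inline void addOneBoxed(int* slot) { *slot += 1; }
```

For a kernel built as `ToyKernel(&addOneBoxed, &addOneUnboxed)`, both
`lookupIsValid` and `call` read only the `unboxed` field; `boxed` is read
only when `unboxed` is null.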
ghstack-source-id: 120697497
Test Plan: Aforementioned perf measurement
Reviewed By: ezyang
Differential Revision: D26113650
fbshipit-source-id: 8448c4ed764d477f63eb7c0f6dd87b1fc0228b73