Make out ops c10-full (with hacky-wrapper) (#48912)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48912
ghstack-source-id: 118619234
(Note: this ignores all push blocking failures!)
Test Plan:
Benchmark:
---
Old (i.e. code-generated unboxing wrapper + no hacky_wrapper):
```
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.CallgrindStats object at 0x7f64d03ebcd0>
torch.absolute(t, out=o)
setup:
t = torch.empty([1])
o = torch.empty([1])
                           All          Noisy symbols removed
    Instructions:       657204                     634396
    Baseline:             4192                       3786
100 runs per measurement, 1 thread
```
New (i.e. templated unboxing wrapper + hacky_wrapper):
```
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.CallgrindStats object at 0x7fa7de211cd0>
torch.absolute(t, out=o)
setup:
t = torch.empty([1])
o = torch.empty([1])
                           All          Noisy symbols removed
    Instructions:       658160                     633996
    Baseline:             4210                       3786
100 runs per measurement, 1 thread
```
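To make the comparison easier to read, here is a small sketch that computes the old-vs-new deltas from the numbers reported above. It assumes the two numeric columns are "All" and "Noisy symbols removed", as in the usual `CallgrindStats` printout; the `deltas` helper is hypothetical and not part of PyTorch.

```python
# Instruction counts copied from the two CallgrindStats reports above.
# Assumption: the columns are (All, Noisy symbols removed).
old = {"Instructions": (657204, 634396), "Baseline": (4192, 3786)}
new = {"Instructions": (658160, 633996), "Baseline": (4210, 3786)}

COLUMNS = ("All", "Noisy symbols removed")


def deltas(old_counts, new_counts):
    """Return a list of (absolute change, relative change in %) per column."""
    return [(n - o, 100.0 * (n - o) / o) for o, n in zip(old_counts, new_counts)]


for row in old:
    for label, (diff, pct) in zip(COLUMNS, deltas(old[row], new[row])):
        print(f"{row:<14}{label:<23}{diff:+6d} ({pct:+.3f}%)")
```

Under that assumption, every change is below 0.5% in magnitude, and the instruction count with noisy symbols removed is slightly lower for the new wrapper.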
Reviewed By: bhosmer
Differential Revision: D25363335
fbshipit-source-id: ab9c122491e4209a49254dad0f7b3adb677b2c53