[static runtime] dequantize out variant (#68664)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68664
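Reland of D32187063 (https://github.com/pytorch/pytorch/commit/f120335643b570dbebb5020ea92624f22276a498) with the lint failure fixed.
Add an out variant for aten::dequantize to Static Runtime. It shows up in profiles as static_runtime::dequantize_copy.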
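For readers unfamiliar with the term, the difference between a functional op and its out variant can be sketched as below. This is a minimal pure-Python illustration, not the Static Runtime implementation: the function names, the list-based buffers, and the per-tensor affine formula `(q - zero_point) * scale` are assumptions made for the example, and the real op also covers other quantization schemes.

```python
from typing import List


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Functional form (hypothetical): allocates a fresh output on every call."""
    return [(v - zero_point) * scale for v in q]


def dequantize_out(
    q: List[int], scale: float, zero_point: int, out: List[float]
) -> List[float]:
    """Out-variant form (hypothetical): writes into a caller-owned buffer."""
    if len(out) != len(q):
        # Resize the existing buffer in place instead of creating a new one.
        out[:] = [0.0] * len(q)
    for i, v in enumerate(q):
        out[i] = (v - zero_point) * scale
    return out


# The caller owns the buffer and reuses it across iterations.
buf: List[float] = [0.0] * 3
for _ in range(5):
    result = dequantize_out([10, 12, 14], 0.5, 10, buf)
```

Reusing the caller's buffer across iterations is what lets the runtime skip a per-call output allocation, which is presumably where the savings in the profiles come from.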
Test Plan:
Tested on the inline_cvr model:
```
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 numactl -m 0 -C 3 ./buck-out/opt/gen/caffe2/caffe2/fb/predictor/ptvsc2_predictor_bench --scripted_model=/data/users/ansha/tmp/adfinder/294738512/294738512_0.predictor.disagg.local --recordio_inputs=/data/users/ansha/tmp/adfinder/294738512/294738512_0_local.inputs.recordio --pt_enable_static_runtime=1 --compare_results=1 --iters=5 --warmup_iters=5 --num_threads=1 --do_profile=1 --method_name=local.forward --set_compatibility --do_benchmark=1 --recordio_use_ivalue_format=1
```
Before:
0.047472 ms. 0.409729%. aten::dequantize (9 nodes)
After:
0.0307179 ms. 0.267204%. static_runtime::dequantize_copy (9 nodes, out variant)
Tested on ctr_mbl_feed model 307210374 with 696 inputs:
Before:
0.0569016 ms. 0.296647%. aten::dequantize (10 nodes)
After:
0.0423128 ms. 0.220481%. static_runtime::dequantize_copy (10 nodes, out variant)
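The relative change in per-op time can be derived from the four timings above. The snippet below is only a convenience sketch: the helper name `reduction_pct` is made up for the example, and the numbers are copied from the profiles in this test plan. It gives roughly a 35% reduction on inline_cvr and roughly 26% on ctr_mbl_feed.

```python
def reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage of per-op time saved going from before_ms to after_ms."""
    return (before_ms - after_ms) / before_ms * 100.0


# (before, after) timings in ms, copied from the profiler output above.
runs = {
    "inline_cvr": (0.047472, 0.0307179),
    "ctr_mbl_feed": (0.0569016, 0.0423128),
}

for name, (before, after) in runs.items():
    print(f"{name}: {reduction_pct(before, after):.1f}% less time in dequantize")
```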
Reviewed By: mikeiovine
Differential Revision: D32566429
fbshipit-source-id: b95dfc4c5e4115e083794093bc1571c7b1d72f5b