[PyTorch][Static Runtime] Borrow outputs in static_runtime::VarTupleUnpack (#68161)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68161
Continuing rollout of borrowing outputs for native ops.
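For readers new to the borrowing rollout, a rough standalone analogy is sketched below. It is not the Static Runtime implementation: `Value`, `unpack_owning`, and `unpack_borrowing` are hypothetical names, and `std::shared_ptr` stands in for a refcounted value. The assumption is that the benefit of borrowing comes from skipping reference-count updates when tuple elements are handed to the op's outputs.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for a refcounted value (not a PyTorch type).
using Value = std::shared_ptr<int>;

// Owning unpack: each output is a copy of the element, so every
// element's reference count is incremented (and later decremented).
std::vector<Value> unpack_owning(const std::vector<Value>& tuple) {
  return tuple;
}

// Borrowing unpack: each output merely points at the element that the
// tuple still owns, so no reference counts are touched. The tuple must
// outlive the borrowed outputs.
std::vector<const Value*> unpack_borrowing(const std::vector<Value>& tuple) {
  std::vector<const Value*> outputs;
  outputs.reserve(tuple.size());
  for (const Value& element : tuple) {
    outputs.push_back(&element);
  }
  return outputs;
}
```

The trade-off in this sketch is the usual one for borrowing: less refcount work per unpack, in exchange for the caller guaranteeing that the source tuple stays alive while the outputs are in use.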
ghstack-source-id: 143424920
Test Plan:
Compare CMF local_ro perf again.
Previous diff:
```
I1110 22:05:23.245435 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.03272. Iters per second: 968.313
I1110 22:05:23.822196 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.06478. Iters per second: 939.163
I1110 22:05:24.395256 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.035. Iters per second: 966.186
I1110 22:05:24.964169 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.02786. Iters per second: 972.898
I1110 22:05:25.536558 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.03205. Iters per second: 968.946
I1110 22:05:26.109027 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.04256. Iters per second: 959.174
I1110 22:05:26.679611 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.03245. Iters per second: 968.567
I1110 22:05:27.253048 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.04493. Iters per second: 957.005
I1110 22:05:27.822629 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.0299. Iters per second: 970.971
I1110 22:05:28.393326 113949 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 1.03039. Iters per second: 970.509
I1110 22:05:28.393368 113949 PyTorchPredictorBenchLib.cpp:285] Mean milliseconds per iter: 1.03726, standard deviation: 0.0111053
```
This diff:
```
I1110 22:18:48.453075 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.931188. Iters per second: 1073.9
I1110 22:18:48.967614 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.933196. Iters per second: 1071.59
I1110 22:18:49.483338 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.932087. Iters per second: 1072.86
I1110 22:18:49.997144 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.930877. Iters per second: 1074.26
I1110 22:18:50.529383 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.936981. Iters per second: 1067.26
I1110 22:18:51.085038 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.953214. Iters per second: 1049.08
I1110 22:18:51.607192 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.940719. Iters per second: 1063.02
I1110 22:18:52.126169 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.942638. Iters per second: 1060.85
I1110 22:18:52.644445 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.937574. Iters per second: 1066.58
I1110 22:18:53.163486 191647 PyTorchPredictorBenchLib.cpp:274] PyTorch run finished. Milliseconds per iter: 0.941636. Iters per second: 1061.98
I1110 22:18:53.163537 191647 PyTorchPredictorBenchLib.cpp:285] Mean milliseconds per iter: 0.938011, standard deviation: 0.00691196
```
0.099 ms/iter (about 9.6%) improvement over the previous diff (mean 1.03726 ms/iter -> 0.938011 ms/iter).
Reviewed By: hlu1
Differential Revision: D32347900
fbshipit-source-id: 8169ebcadf1248e555a18bbffa99eef6cac1ba85