pytorch
665d5e2a - [PyTorch][JIT] Audit interpreter for extra copies (#54029)

Commit
3 years ago
[PyTorch][JIT] Audit interpreter for extra copies (#54029)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54029

I found what appear to be some missed moves and/or extra copies in the JIT interpreter.
ghstack-source-id: 123958682

Test Plan:
Existing CI for correctness

Ran AdIndexer inline_cvr local_ro model benchmark with static_runtime off via
`env bin=/tmp/ptvsc2_predictor_bench.StaticDispatchModeFile static_runtime=0 caffe2=0 scripts/swolchok/static_runtime/inline_cvr/run_local_ro.sh`

before:
```
I0315 14:25:23.916893 3075680 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.01635. Iters per second: 983.914
I0315 14:26:05.536207 3080560 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.01689. Iters per second: 983.395
I0315 14:26:47.510561 3083335 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.02697. Iters per second: 973.737
I0315 14:27:29.024830 3086767 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.01326. Iters per second: 986.918
I0315 14:28:10.849496 3091323 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.023. Iters per second: 977.517
```

after:
```
I0315 14:17:43.280469 3046242 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 0.997838. Iters per second: 1002.17
I0315 14:18:24.244606 3046861 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.00173. Iters per second: 998.269
I0315 14:19:05.208899 3051998 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.00187. Iters per second: 998.136
I0315 14:19:46.103854 3055392 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.00073. Iters per second: 999.27
I0315 14:20:27.011411 3056062 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 0.999121. Iters per second: 1000.88
```

(This was just a convenient workload I had handy; the plan of record is to use static runtime for inline_cvr inference AIUI.)

Reviewed By: dhruvbird, walterddr

Differential Revision: D27060762

fbshipit-source-id: 5567206d7c2d9ae99776ce5524caf09ec2035e87
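For readers unfamiliar with the phrase "missed moves", the sketch below illustrates the general pattern in plain C++. It is not taken from the actual diff: `Tracked`, `popByCopy`, `popByMove`, and the global counters are hypothetical names invented for this illustration, and `Tracked` merely stands in for any value type whose copy is more expensive than its move (for example, one that bumps a reference count).

```cpp
#include <utility>
#include <vector>

// Hypothetical counters so the difference between the two versions is observable.
int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for a value type with a non-trivial copy.
struct Tracked {
  int data;
  explicit Tracked(int d) : data(d) {}
  Tracked(const Tracked& other) : data(other.data) { ++g_copies; }
  Tracked(Tracked&& other) noexcept : data(other.data) { ++g_moves; }
};

// "Extra copy" version: the top of the stack is copied into v,
// and the original is then destroyed by pop_back().
Tracked popByCopy(std::vector<Tracked>& stack) {
  Tracked v = stack.back();  // invokes the copy constructor
  stack.pop_back();
  return v;
}

// "Moved" version: the element is about to be destroyed anyway,
// so its contents can be moved out instead of copied.
Tracked popByMove(std::vector<Tracked>& stack) {
  Tracked v = std::move(stack.back());  // invokes the move constructor
  stack.pop_back();
  return v;
}
```

Both functions return the same value; only the second avoids the copy constructor. For a cheap type the difference is negligible, which is consistent with the small per-iteration change seen in the benchmark logs above.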