pytorch
673687e7 - [PyTorch] Refactor Dispatcher to inline less code in fast path (#51163)

Commit
3 years ago
[PyTorch] Refactor Dispatcher to inline less code in fast path (#51163)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51163

The Dispatcher seems to have been in a precarious local maximum: I tried to make several different changes to parameter passing and ended up with regressions due to reduced inlining that swamped any gains I might have gotten from the parameter passing changes.

This diff reduces the amount of inline code on the fast path. It should both reduce code size and provide a platform for making further improvements to the dispatcher code.

It is a slight performance regression, but it unblocked the following two diffs (which seem to get us back where we were) from landing.
ghstack-source-id: 120693163

Test Plan:
CI, framework overhead benchmarks to check the size of the regression

Compared timing for empty framework overhead benchmark before/after.

Build command: `buck build mode/no-gpu //caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:cpp_benchmark mode/opt-clang --show-output`
Run with `numactl -m 0 -C 3 path/to/cpp_benchmark -op empty -niter 100`

Before:
```
I0126 16:02:04.373075 2135872 bench.cpp:139] Mean 0.266272
I0126 16:02:04.373106 2135872 bench.cpp:140] Median 0.266347
I0126 16:02:04.373111 2135872 bench.cpp:141] Min 0.263585
I0126 16:02:04.373117 2135872 bench.cpp:142] stddev 0.0021264
I0126 16:02:04.373131 2135872 bench.cpp:143] stddev / mean 0.00798581
```

After:
```
I0126 16:02:30.377992 2137048 bench.cpp:139] Mean 0.27579
I0126 16:02:30.378023 2137048 bench.cpp:140] Median 0.275281
I0126 16:02:30.378029 2137048 bench.cpp:141] Min 0.270617
I0126 16:02:30.378034 2137048 bench.cpp:142] stddev 0.00308287
I0126 16:02:30.378044 2137048 bench.cpp:143] stddev / mean 0.0111783
```

Yes, it's a regression, but I compared D26069629 stacked on this diff vs not:

With this diff:
```
I0126 16:02:50.662864 2137574 bench.cpp:139] Mean 0.268645
I0126 16:02:50.662891 2137574 bench.cpp:140] Median 0.267485
I0126 16:02:50.662896 2137574 bench.cpp:141] Min 0.266485
I0126 16:02:50.662901 2137574 bench.cpp:142] stddev 0.00219359
I0126 16:02:50.662915 2137574 bench.cpp:143] stddev / mean 0.00816537
```

Without:
```
I0126 20:40:27.815824 3240699 bench.cpp:139] Mean 0.270755
I0126 20:40:27.815860 3240699 bench.cpp:140] Median 0.268998
I0126 20:40:27.815866 3240699 bench.cpp:141] Min 0.268306
I0126 20:40:27.815873 3240699 bench.cpp:142] stddev 0.00260365
I0126 20:40:27.815886 3240699 bench.cpp:143] stddev / mean 0.00961624
```

So we do seem to have accomplished something w.r.t. not overwhelming the inliner.

Reviewed By: ezyang

Differential Revision: D26091377

fbshipit-source-id: c9b7f4e187059fa15452b7c75fc29816022b92b1
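
To put the "slight regression" in numbers, the reported means can be compared directly. The sketch below is not part of the commit; the helper name `relative_change_pct` is made up for illustration, and the only inputs are the `Mean` values copied from the logs above. Each figure comes from a single benchmark run, so treat the results as rough indications.

```python
def relative_change_pct(baseline: float, candidate: float) -> float:
    """Percentage change of candidate relative to baseline (positive = slower)."""
    return (candidate - baseline) / baseline * 100.0


# Mean values copied from the benchmark logs in the commit message.
MEAN_BEFORE = 0.266272          # before this diff
MEAN_AFTER = 0.27579            # after this diff
MEAN_STACKED_WITH = 0.268645    # D26069629 stacked on this diff
MEAN_STACKED_WITHOUT = 0.270755 # D26069629 without this diff

# Regression introduced by this diff alone.
regression = relative_change_pct(MEAN_BEFORE, MEAN_AFTER)

# Effect of having this diff underneath D26069629.
stacked = relative_change_pct(MEAN_STACKED_WITHOUT, MEAN_STACKED_WITH)

print(f"this diff alone: {regression:+.2f}%")
print(f"D26069629 with vs. without this diff: {stacked:+.2f}%")
```

By these means, the diff alone is roughly 3.6% slower, while D26069629 is roughly 0.8% faster with this diff underneath it than without. The second difference is about the same size as the reported `stddev / mean` of the runs (0.8% to 1.0%), so it is suggestive rather than conclusive.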