pytorch
f5e72552 - [PyTorch] Save a single add instruction in the dispatcher (#52543)

Commit
4 years ago
[PyTorch] Save a single add instruction in the dispatcher (#52543)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52543

This saves one (1) add instruction. New code comments should explain exactly why. In short, we store a direct pointer in `OperatorHandle` in addition to the `std::list<OperatorDef>::iterator` because converting the latter to the former requires an add instruction.

It is not clear to me whether this is a particularly great tradeoff, but I spent (more) time on it (than I expected), so here it is for review.

ghstack-source-id: 122147199

Test Plan:
Inspect assembly for at::empty in benchmark code -- see add instruction disappeared.

Compare empty benchmark performance to baseline with perf stat.

Baseline:
```
          5,077.43 msec task-clock      #    1.000 CPUs utilized     ( +-  0.25% )
               405      context-switches #    0.080 K/sec             ( +-  1.37% )
                 3      cpu-migrations   #    0.001 K/sec             ( +- 18.22% )
            12,259      page-faults      #    0.002 M/sec             ( +-  0.10% )
    10,089,754,343      cycles           #    1.987 GHz               ( +-  0.25% )  (50.04%)
    29,516,000,227      instructions     #    2.93  insn per cycle    ( +-  0.04% )  (50.08%)
     5,662,629,032      branches         # 1115.256 M/sec             ( +-  0.02% )  (50.08%)
         1,955,729      branch-misses    #    0.03% of all branches   ( +-  0.88% )  (50.04%)

            5.0796 +- 0.0128 seconds time elapsed  ( +-  0.25% )
```

After:
```
          5,017.77 msec task-clock      #    1.001 CPUs utilized     ( +-  0.19% )
               400      context-switches #    0.080 K/sec             ( +-  3.09% )
                 4      cpu-migrations   #    0.001 K/sec             ( +- 46.91% )
            12,240      page-faults      #    0.002 M/sec             ( +-  0.37% )
     9,960,189,535      cycles           #    1.985 GHz               ( +-  0.19% )  (50.02%)
    29,467,149,773      instructions     #    2.96  insn per cycle    ( +-  0.11% )  (50.03%)
     5,661,074,219      branches         # 1128.206 M/sec             ( +-  0.02% )  (50.07%)
         2,032,712      branch-misses    #    0.04% of all branches   ( +-  1.35% )  (50.07%)

            5.0151 +- 0.0101 seconds time elapsed  ( +-  0.20% )
```

1.2% cycles win, outside the noise
0.16% instruction count win, barely outside noise

I am surprised at the size of the cycles win.
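For readers who want to check the two headline percentages against the raw `perf stat` counters, here is a small sketch of the arithmetic. The function names are made up for this illustration; only the counter values come from the output in the commit message. The relative reductions work out to roughly 1.28% for cycles and roughly 0.166% for instructions, in line with the quoted "1.2%" and "0.16%".

```cpp
#include <cassert>

// Relative reduction from `baseline` to `after`, expressed in percent.
// (Hypothetical helper, not part of PyTorch or perf.)
inline double percent_reduction(double baseline, double after) {
  return (baseline - after) / baseline * 100.0;
}

// Cycle counts copied from the Baseline and After perf stat output.
inline double cycles_win_percent() {
  return percent_reduction(10089754343.0, 9960189535.0);
}

// Instruction counts copied from the Baseline and After perf stat output.
inline double instructions_win_percent() {
  return percent_reduction(29516000227.0, 29467149773.0);
}
```

Both results can be compared with the reported run-to-run variation: the cycles reduction is several times the 0.19% to 0.25% spread, while the instruction reduction is only slightly larger than the 0.04% to 0.11% spread.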
Reviewed By: bhosmer

Differential Revision: D26564192

fbshipit-source-id: 71f731ba54ec1cb407673db691eaf77a257de4a9
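The tradeoff described in the summary can be sketched roughly as follows. This is a simplified illustration, not the actual PyTorch code: the `OperatorDef` struct, the member names, and the helper function are stand-ins chosen for this example. The underlying assumption is that in common standard library implementations a `std::list` iterator holds a pointer to the list node, and the element lives at a fixed offset after the node's link pointers, so `&*it` costs one address addition. Caching the element pointer next to the iterator avoids that addition on the hot path at the price of one extra pointer per handle.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for the dispatcher's per-operator record.
struct OperatorDef {
  std::string name;
};

// Simplified handle: keeps the list iterator (still needed to erase the
// entry later) and additionally caches a direct pointer to the element.
class OperatorHandle {
 public:
  explicit OperatorHandle(std::list<OperatorDef>::iterator it)
      : operatorDef_(&*it), operatorIterator_(it) {}

  // Hot path: dereferences the cached pointer directly.
  const std::string& name() const { return operatorDef_->name; }

  // Equivalent access through the iterator, which typically needs the
  // node-to-element offset to be added first.
  const std::string& nameViaIterator() const { return operatorIterator_->name; }

  const OperatorDef* def() const { return operatorDef_; }
  std::list<OperatorDef>::iterator iterator() const { return operatorIterator_; }

 private:
  OperatorDef* operatorDef_;
  std::list<OperatorDef>::iterator operatorIterator_;
};

// Checks that the cached pointer and the iterator refer to the same element,
// even after the list grows.
inline bool cached_pointer_matches_iterator() {
  std::list<OperatorDef> ops;
  ops.push_back({"aten::empty"});
  ops.push_back({"aten::add"});
  OperatorHandle handle(ops.begin());
  // Inserting into a std::list invalidates neither iterators nor element
  // pointers, so both members of the handle stay valid.
  ops.push_front({"aten::mul"});
  return handle.def() == &*handle.iterator() &&
         handle.name() == "aten::empty" &&
         handle.nameViaIterator() == "aten::empty";
}
```

The cached pointer is safe only because `std::list` never relocates its elements; the same trick would not work with a container such as `std::vector`, whose elements can move on reallocation.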