pytorch
8acb74c4 - [PyTorch] Make IValue::toTensor() inlineable (#53213)

3 years ago
[PyTorch] Make IValue::toTensor() inlineable (#53213)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53213

The failure path for toTensor() is fairly long because it has to stringify tagKind() and construct a std::string. Forcibly outlining it should allow inlining the happy path.
ghstack-source-id: 123012703

Test Plan:
1) Compare perf profile on AdIndexer benchmark before/after -- toTensor frames no longer show up, demonstrating inlining
2) Compare perf stat results on AdIndexer benchmark before/after:

Before:
```
         17,104.66 msec task-clock                #    0.999 CPUs utilized            ( +-  0.26% )
             3,666      context-switches          #    0.214 K/sec                    ( +- 18.53% )
                 3      cpu-migrations            #    0.000 K/sec                    ( +-  6.25% )
           102,745      page-faults               #    0.006 M/sec                    ( +-  0.47% )
    33,860,604,938      cycles                    #    1.980 GHz                      ( +-  0.25% )  (50.02%)
    69,514,752,652      instructions              #    2.05  insn per cycle           ( +-  0.06% )  (50.01%)
    11,280,877,966      branches                  #  659.521 M/sec                    ( +-  0.11% )  (50.01%)
        75,739,099      branch-misses             #    0.67% of all branches          ( +-  0.98% )  (50.03%)

# Table of individual measurements:
17.2467 (+0.1172) #
17.0014 (-0.1280) #
17.2134 (+0.0840) #
17.0951 (-0.0343) #
17.0905 (-0.0389) #
#
# Final result:
17.1294 +- 0.0447 seconds time elapsed  ( +-  0.26% )
```

After:
```
         16,910.66 msec task-clock                #    0.999 CPUs utilized            ( +-  0.27% )
             3,495      context-switches          #    0.207 K/sec                    ( +- 18.34% )
                 3      cpu-migrations            #    0.000 K/sec                    ( +-  6.25% )
           101,769      page-faults               #    0.006 M/sec                    ( +-  0.45% )
    33,460,776,952      cycles                    #    1.979 GHz                      ( +-  0.28% )  (50.03%)
    69,243,346,925      instructions              #    2.07  insn per cycle           ( +-  0.17% )  (50.02%)
    11,229,930,860      branches                  #  664.074 M/sec                    ( +-  0.14% )  (50.03%)
        72,273,324      branch-misses             #    0.64% of all branches          ( +-  0.55% )  (50.03%)

# Table of individual measurements:
16.9530 (+0.0246) #
17.0898 (+0.1614) #
16.8493 (-0.0791) #
16.8282 (-0.1002) #
16.9217 (-0.0067) #
#
# Final result:
16.9284 +- 0.0464 seconds time elapsed  ( +-  0.27% )
```

1.1% cycles win, 0.38% instructions win, both apparently outside noise level

Reviewed By: smessmer

Differential Revision: D26793481

fbshipit-source-id: b035b3ad20f9e22ae738d91163641031b1130ce6
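
The outlining idea described in the summary can be sketched roughly as follows. This is not the actual IValue code from the commit: `Value`, `Tag`, `toInt`, `reportToIntError`, and the `SKETCH_NOINLINE` macro are hypothetical names chosen for illustration, and the sketch assumes a GCC, Clang, or MSVC compiler for the no-inline attribute. The accessor keeps only a tag check and a load, while the string-building error path lives in a separate function that is never inlined.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical portability macro (an assumption, not a PyTorch macro).
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#else
#define SKETCH_NOINLINE
#endif

// Hypothetical stand-in for a tagged value; not the real c10::IValue.
enum class Tag { Int, Text };

class Value {
 public:
  explicit Value(long i) : tag_(Tag::Int), int_(i) {}
  explicit Value(std::string s)
      : tag_(Tag::Text), int_(0), text_(std::move(s)) {}

  // Happy path: one comparison and one load, so the compiler can
  // cheaply inline this accessor at every call site.
  long toInt() const {
    if (tag_ != Tag::Int) {
      reportToIntError();  // cold path, never returns
    }
    return int_;
  }

 private:
  // Failure path: declared here, defined out of line below, and marked
  // no-inline so its string handling does not bloat toInt().
  [[noreturn]] SKETCH_NOINLINE void reportToIntError() const;

  static const char* tagName(Tag t) {
    return t == Tag::Int ? "Int" : "Text";
  }

  Tag tag_;
  long int_;
  std::string text_;
};

// Stringifies the tag and constructs a std::string, which is the kind
// of work the summary says made the original failure path long.
void Value::reportToIntError() const {
  throw std::runtime_error(std::string("Expected Int but got ") +
                           tagName(tag_));
}
```

With this layout, `Value(7L).toInt()` returns 7 through the short path, while `Value(std::string("x")).toInt()` throws a `std::runtime_error` from the outlined function. Whether the accessor actually gets inlined still depends on the compiler and optimization level, which is why the test plan above checks a perf profile rather than assuming it.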