pytorch
059ee85c - [PyTorch] Devirtualize TensorImpl::storage() (#51050)

Commit View On GitHub

Commit

3 years ago

[PyTorch] Devirtualize TensorImpl::storage() (#51050) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51050 Subclasses want to be able to make storage() calls throw, so we find some free space in TensorImpl to add a flag that they can set to make that happen without making storage() virtual. It should still be inlineable. ghstack-source-id: 121819684 Test Plan: Compared `perf stat` on 1M iterations on AdIndexer benchmark before/after Before: ``` 74,483.15 msec task-clock # 0.999 CPUs utilized ( +- 0.14% ) 16,637 context-switches # 0.223 K/sec ( +- 11.97% ) 3 cpu-migrations # 0.000 K/sec ( +- 7.20% ) 107,085 page-faults # 0.001 M/sec ( +- 2.39% ) 147,356,440,831 cycles # 1.978 GHz ( +- 0.14% ) (50.06%) 278,678,430,378 instructions # 1.89 insn per cycle ( +- 0.01% ) (50.05%) 43,540,698,177 branches # 584.571 M/sec ( +- 0.01% ) (50.05%) 141,028,843 branch-misses # 0.32% of all branches ( +- 1.00% ) (50.05%) ``` After: ``` 74,178.77 msec task-clock # 0.999 CPUs utilized ( +- 0.31% ) 17,125 context-switches # 0.231 K/sec ( +- 3.41% ) 3 cpu-migrations # 0.000 K/sec 109,535 page-faults # 0.001 M/sec ( +- 1.04% ) 146,803,364,372 cycles # 1.979 GHz ( +- 0.30% ) (50.03%) 277,726,600,254 instructions # 1.89 insn per cycle ( +- 0.02% ) (50.03%) 43,299,659,815 branches # 583.720 M/sec ( +- 0.03% ) (50.03%) 130,504,094 branch-misses # 0.30% of all branches ( +- 1.14% ) (50.03%) ``` Looks like approximately 0.3% instruction count win (and similarly for cycles, but that's within noise). Reviewed By: ezyang Differential Revision: D26013815 fbshipit-source-id: 07939957929070e18b9981d492d8279c9bb33c55

Author

swolchok

Committer

facebook-github-bot

Parents

4305609d

pytorch 059ee85c - [PyTorch] Devirtualize TensorImpl::storage() (#51050)

Commit

pytorch
059ee85c - [PyTorch] Devirtualize TensorImpl::storage() (#51050)