pytorch
1fb5cf5a - Migrate `_th_std_var` to ATen (#59258)

Commit
3 years ago
Migrate `_th_std_var` to ATen (#59258) Summary: Ref https://github.com/pytorch/pytorch/issues/49421 This migrates `std`/`var`'s special case all-reduction from TH to ATen. Using the benchmark from gh-43858 that was used to justify keeping the TH version; I find this PR has similar (slightly better) performance in single threaded. And unlike the TH version, this is multi-threaded and so much faster for large tensors. TH Results: ``` [----------------------------- Index ------------------------------] | torch_var | torch_var0 | stdfn | torch_sum0 1 threads: --------------------------------------------------------- 8 | 3.6 | 3.8 | 8.2 | 1.2 80 | 3.7 | 3.8 | 8.4 | 1.2 800 | 4.2 | 4.3 | 8.7 | 1.2 8000 | 9.0 | 9.1 | 11.2 | 1.5 80000 | 58.3 | 59.0 | 30.6 | 4.2 800000 | 546.9 | 546.9 | 183.4 | 31.3 8000000 | 5729.7 | 5701.0 | 6165.4 | 484.1 ``` ATen results: ``` [----------------------------- Index ------------------------------] | torch_var | torch_var0 | stdfn | torch_sum0 1 threads: --------------------------------------------------------- 8 | 4.0 | 4.0 | 8.7 | 1.2 80 | 3.6 | 3.8 | 9.0 | 1.2 800 | 4.1 | 4.3 | 8.9 | 1.2 8000 | 8.9 | 9.2 | 10.6 | 1.5 80000 | 57.0 | 57.4 | 28.8 | 4.3 800000 | 526.9 | 526.9 | 178.3 | 30.2 8000000 | 5568.1 | 5560.6 | 6042.1 | 453.2 [----------------------------- Index ------------------------------] | torch_var | torch_var0 | stdfn | torch_sum0 8 threads: --------------------------------------------------------- 8 | 3.9 | 3.8 | 9.1 | 1.2 80 | 3.8 | 3.9 | 8.8 | 1.2 800 | 4.2 | 4.3 | 8.9 | 1.3 8000 | 9.0 | 9.2 | 10.4 | 1.5 80000 | 26.0 | 26.8 | 26.4 | 4.4 800000 | 92.9 | 87.3 | 72.1 | 22.4 8000000 | 793.5 | 791.8 | 5334.8 | 115.1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/59258 Reviewed By: mruberry Differential Revision: D28821216 Pulled By: ngimel fbshipit-source-id: f35992c21f08a0a8878053680dc0ca7a8facd155
Author
Parents
Loading