pytorch
8a1f42b8 - Speed up threshold on CPU. (#27155)

Commit
5 years ago
Summary:
This is a small fix, but the runtime improvement does seem consistent (a bit less than 10%):

Benchmark (no turbo, Release build, gcc 8.3, RHEL 7.7, Intel(R) Core(TM) i7-8850H):

```python
import timeit

for dtype in ('torch.double', 'torch.float', 'torch.int16', 'torch.int32', 'torch.int64'):
    print(f'dtype={dtype}')
    for n, t in [(70_000, 200000), (700_000, 20000)]:
        print(f'torch.nn.Threshold(0.1, 20)(a), numel() == {n} for {t} times')
        print(timeit.timeit(f'm(a)', setup=f'import torch; m=torch.nn.Threshold(0.1, 20); a = torch.arange({n}, dtype={dtype})', number=t))
```

Before:

```
dtype=torch.double
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
8.88117562699972
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
9.525143070000013
dtype=torch.float
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
5.673380930000349
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
3.677610996000112
dtype=torch.int16
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
3.957677209999929
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
1.8512293700005102
dtype=torch.int32
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
5.624350482999944
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
3.670380037000541
dtype=torch.int64
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
8.86375758200029
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
9.468234717999621
```

After:

```
dtype=torch.double
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
8.64173036200009
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
9.456986365000375
dtype=torch.float
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
5.431988049000211
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
3.446968590000324
dtype=torch.int16
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
3.743787463999979
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
1.823233144000369
dtype=torch.int32
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
5.42801834400052
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
3.4600211680008215
dtype=torch.int64
torch.nn.Threshold(0.1, 20)(a), numel() == 70000 for 200000 times
8.562551314000302
torch.nn.Threshold(0.1, 20)(a), numel() == 700000 for 20000 times
9.37924196699987
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/27155

Differential Revision: D17790768

Pulled By: VitalyFedyunin

fbshipit-source-id: 3281eaff77ddddd658048c9e73824dd68c548591
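To read the Before/After listings side by side, the relative change per configuration can be computed directly from the reported timings. The snippet below is only a rough sketch: the `timings` table and the `percent_faster` helper are illustrative names, not part of the commit, and the numbers are copied from the benchmark output above.

```python
# Timings in seconds, copied from the Before/After listings in the commit message.
# Keys are (dtype, numel); values are (before, after).
# `timings` and `percent_faster` are hypothetical names used for illustration only.
timings = {
    ('double', 70_000): (8.88117562699972, 8.64173036200009),
    ('double', 700_000): (9.525143070000013, 9.456986365000375),
    ('float', 70_000): (5.673380930000349, 5.431988049000211),
    ('float', 700_000): (3.677610996000112, 3.446968590000324),
    ('int16', 70_000): (3.957677209999929, 3.743787463999979),
    ('int16', 700_000): (1.8512293700005102, 1.823233144000369),
    ('int32', 70_000): (5.624350482999944, 5.42801834400052),
    ('int32', 700_000): (3.670380037000541, 3.4600211680008215),
    ('int64', 70_000): (8.86375758200029, 8.562551314000302),
    ('int64', 700_000): (9.468234717999621, 9.37924196699987),
}


def percent_faster(before, after):
    """Return the reduction in wall-clock time as a percentage of `before`."""
    return 100.0 * (before - after) / before


for (dtype, numel), (before, after) in timings.items():
    change = percent_faster(before, after)
    print(f'{dtype:>6} numel={numel:>7}: {change:.1f}% less time')
```

Every configuration comes out faster after the change, and each reduction stays below 10%. These are single runs on one machine, so the per-dtype figures are best treated as indicative.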