pytorch
ef84bcfe - Convert floating-point constants to T in Bessel functions (#59416)

Commit
3 years ago
Convert floating-point constants to T in Bessel functions (#59416)

Summary:
If T is float, many of the computations are more expensive than expected, because unsuffixed floating-point constants are double and promote the surrounding arithmetic to double precision. Compilers may be reluctant to optimize this away because doing so often leads to a different outcome. Convert many constants to T before using them to clear any doubt.

Benchmark (Debian 11, no turbo, Release build, Intel(R) Xeon(R) E-2136 CPU @ 3.30GHz, gcc 10.2.1):

```python
import timeit

for dtype in ('torch.float',):
    for func in ('i0', 'i0e', 'i1', 'i1e'):
        for n, t in [(10_000, 10000), (100_000, 1000)]:
            print(f'torch.special.{func}(torch.arange(n, dtype=torch.float32)), n = {n} for {t} times, dtype={dtype}')
            print(timeit.timeit(f'torch.special.{func}(a)', setup=f'import torch; a = torch.arange({n}, dtype=torch.float32)', number=t))
```

Before:

```
torch.special.i0(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
1.539132010017056
torch.special.i0(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
0.9613071230123751
torch.special.i0e(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
4.32450835997588
torch.special.i0e(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
1.5751779029960744
torch.special.i1(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
1.0810036820184905
torch.special.i1(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
0.5314770240220241
torch.special.i1e(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
0.41711462699458934
torch.special.i1e(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
0.1759720179834403
```

After:

```
torch.special.i0(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
1.337154256994836
torch.special.i0(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
0.8640981369826477
torch.special.i0e(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
4.308618158014724
torch.special.i0e(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
1.5217605629877653
torch.special.i1(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
0.9398589830088895
torch.special.i1(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
0.4667845010117162
torch.special.i1e(torch.arange(n, dtype=torch.float32)), n = 10000 for 10000 times, dtype=torch.float
0.3658539849857334
torch.special.i1e(torch.arange(n, dtype=torch.float32)), n = 100000 for 1000 times, dtype=torch.float
0.15680673700990155
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59416

Reviewed By: anjali411

Differential Revision: D29249897

Pulled By: mruberry

fbshipit-source-id: c170e78f2ab47176ea95b8442c6279d7ec1d75c2
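
The summary's remark that compilers "may be reluctant to optimize" can be illustrated with a small sketch. It is not the code touched by this commit: it emulates float32 rounding in pure Python with `struct`, and the constant `0.1` and the helper names `to_f32`, `scale_double_constant` and `scale_converted_constant` are assumptions made only for illustration. It compares multiplying a float32 value by a double constant (rounding once at the end) against multiplying by the same constant converted to float32 first. Because the two can disagree in the last bit, a compiler cannot silently swap one for the other.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def scale_double_constant(x: float) -> float:
    # Models `float y = x * 0.1;`: x is promoted to double, multiplied by the
    # double constant 0.1, and the result is rounded to float32 on store.
    return to_f32(to_f32(x) * 0.1)


def scale_converted_constant(x: float) -> float:
    # Models `float y = x * float(0.1);`: the constant is rounded to float32
    # first. The product of two float32 values is exact in a double, so one
    # final rounding gives the correctly rounded single-precision product.
    return to_f32(to_f32(x) * to_f32(0.1))


# Collect the integer inputs for which the two evaluation orders disagree.
mismatches = [
    x for x in range(1, 1001)
    if scale_double_constant(float(x)) != scale_converted_constant(float(x))
]
print(f"{len(mismatches)} of 1000 inputs differ, e.g. x = {mismatches[0]}")
```

Constants that are exactly representable in float32 (such as 0.5 or 2.0) would not show a difference in this one-operation sketch. The explicit conversion to T named in the commit title still makes it unambiguous that single-precision arithmetic is intended.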