d0bd8a3a - Migrate `sin` and `sin_` from TH to ATen (CUDA) (#28237)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28237

Benchmark (RHEL 7, gcc 8.3.1, P1000):

```python
import timeit
for n, t in [(10_000, 20000), (100_000, 20000)]:
    for dtype in ('torch.half', 'torch.float', 'torch.double'):
        print(f'torch.sin(a) a.numel() == {n} for {t} times {dtype}')
        print(timeit.timeit(f'torch.sin(a); torch.cuda.synchronize()',
                            setup=f'import torch; a=torch.arange({n}, dtype={dtype}, device="cuda")',
                            number=t))
```

Before:

```
torch.sin(a) a.numel() == 10000 for 20000 times torch.half
0.4649172620011086
torch.sin(a) a.numel() == 10000 for 20000 times torch.float
0.4616892600006395
torch.sin(a) a.numel() == 10000 for 20000 times torch.double
0.5166665920005471
torch.sin(a) a.numel() == 100000 for 20000 times torch.half
0.5376560490003612
torch.sin(a) a.numel() == 100000 for 20000 times torch.float
0.6207812359989475
torch.sin(a) a.numel() == 100000 for 20000 times torch.double
1.873208982999131
```

After:

```
torch.sin(a) a.numel() == 10000 for 20000 times torch.half
0.4796977340010926
torch.sin(a) a.numel() == 10000 for 20000 times torch.float
0.48329569199995603
torch.sin(a) a.numel() == 10000 for 20000 times torch.double
0.5380683220009814
torch.sin(a) a.numel() == 100000 for 20000 times torch.half
0.5299932739999349
torch.sin(a) a.numel() == 100000 for 20000 times torch.float
0.6144487999990815
torch.sin(a) a.numel() == 100000 for 20000 times torch.double
1.8838113630008593
```

Close #24627

Test Plan: Imported from OSS

Differential Revision: D18089072

Pulled By: VitalyFedyunin

fbshipit-source-id: 4824804960309fe7fdb16073d021388704986993
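For context, below is a minimal sketch of what such a TH-to-ATen CUDA migration typically looks like. The file path, kernel name, and lambda body are assumptions based on the common ATen unary-op pattern of this era (TensorIterator plus `gpu_kernel`), not the literal diff of this commit:

```cpp
// Hypothetical sketch (assumed names/path) of the usual ATen CUDA unary-op
// pattern, e.g. in aten/src/ATen/native/cuda/UnaryOpsKernel.cu.
#include <ATen/Dispatch.h>
#include <ATen/native/TensorIterator.h>
#include <ATen/native/UnaryOps.h>
#include <ATen/native/cuda/Loops.cuh>

namespace at { namespace native {

// Computes sin elementwise. TensorIterator handles output allocation and
// dtype/overlap checks; gpu_kernel generates the elementwise CUDA launch.
void sin_kernel_cuda(TensorIterator& iter) {
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(iter.dtype(), "sin_cuda", [&]() {
    gpu_kernel(iter, [] GPU_LAMBDA (scalar_t a) -> scalar_t {
      return ::sin(a);
    });
  });
}

// Routes the device-independent sin entry points to this CUDA kernel.
REGISTER_DISPATCH(sin_stub, &sin_kernel_cuda);

}} // namespace at::native
```

Under this pattern, both `torch.sin(a)` and the in-place `a.sin_()` typically dispatch through the same stub to one TensorIterator-based kernel, which is presumably why the benchmark above exercises only `torch.sin`.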