pytorch
7e9cc4df - Migrate `cos` and `cos_` from TH to ATen (CUDA) (#36653)

Commit
5 years ago
Migrate `cos` and `cos_` from TH to ATen (CUDA) (#36653)

Summary:
Benchmark with same build settings on same system.

Closes https://github.com/pytorch/pytorch/issues/24545

gcc : version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
CUDA : 10.1
GPU : 1050ti

```python
import timeit

for n, t in [(10_000, 20000),
             (100_000, 20000)]:
    for dtype in ('torch.half', 'torch.float', 'torch.double'):
        print(f'torch.cos(a) a.numel() == {n} for {t} times {dtype}')
        print(timeit.timeit(f'torch.cos(a); torch.cuda.synchronize()', setup=f'import torch; a=torch.arange({n}, dtype={dtype}, device="cuda")', number=t))
```

Before:

```
torch.cos(a) a.numel() == 10000 for 20000 times torch.half
0.2797315450006863
torch.cos(a) a.numel() == 10000 for 20000 times torch.float
0.283109110998339
torch.cos(a) a.numel() == 10000 for 20000 times torch.double
0.3648525129974587
torch.cos(a) a.numel() == 100000 for 20000 times torch.half
0.34239949499897193
torch.cos(a) a.numel() == 100000 for 20000 times torch.float
0.33680364199972246
torch.cos(a) a.numel() == 100000 for 20000 times torch.double
1.0512770260102116
```

After:

```
torch.cos(a) a.numel() == 10000 for 20000 times torch.half
0.285825898999974
torch.cos(a) a.numel() == 10000 for 20000 times torch.float
0.2781305120001889
torch.cos(a) a.numel() == 10000 for 20000 times torch.double
0.34188826099989456
torch.cos(a) a.numel() == 100000 for 20000 times torch.half
0.29040409300023384
torch.cos(a) a.numel() == 100000 for 20000 times torch.float
0.28678944200009937
torch.cos(a) a.numel() == 100000 for 20000 times torch.double
1.065477349000048
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/36653

Differential Revision: D21164675

Pulled By: VitalyFedyunin

fbshipit-source-id: 5dd5d3af47c2a5527e1f4ab7669c2ed9a2293cee
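
The Before/After numbers above are total seconds for 20000 calls, which makes them hard to compare at a glance. The sketch below is not part of the commit: it copies the reported timings into two dictionaries and derives per-call microseconds and the relative change for each case. The names `relative_change` and `summarize` are hypothetical helpers introduced only for this illustration.

```python
# Timings copied from the benchmark output above: total seconds for CALLS calls.
CALLS = 20000

before = {
    (10_000, "torch.half"): 0.2797315450006863,
    (10_000, "torch.float"): 0.283109110998339,
    (10_000, "torch.double"): 0.3648525129974587,
    (100_000, "torch.half"): 0.34239949499897193,
    (100_000, "torch.float"): 0.33680364199972246,
    (100_000, "torch.double"): 1.0512770260102116,
}

after = {
    (10_000, "torch.half"): 0.285825898999974,
    (10_000, "torch.float"): 0.2781305120001889,
    (10_000, "torch.double"): 0.34188826099989456,
    (100_000, "torch.half"): 0.29040409300023384,
    (100_000, "torch.float"): 0.28678944200009937,
    (100_000, "torch.double"): 1.065477349000048,
}


def relative_change(old, new):
    """Fractional change from old to new; a negative value means faster."""
    return (new - old) / old


def summarize(before, after, calls=CALLS):
    """Return (numel, dtype, us_before, us_after, change) for every case."""
    rows = []
    for (numel, dtype), old_total in before.items():
        new_total = after[(numel, dtype)]
        rows.append((
            numel,
            dtype,
            old_total / calls * 1e6,   # microseconds per call before
            new_total / calls * 1e6,   # microseconds per call after
            relative_change(old_total, new_total),
        ))
    return rows


for numel, dtype, us_old, us_new, change in summarize(before, after):
    print(f"{numel:>7} {dtype:<13} {us_old:7.2f} us -> {us_new:7.2f} us ({change:+.1%})")
```

Each timing comes from a single `timeit` run and includes the `torch.cuda.synchronize()` call, so differences of a few percent are plausibly within run-to-run noise; the reductions for `torch.half` and `torch.float` at 100000 elements are the only changes that stand out clearly.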