pytorch
ed30afd4 - Speed up torch.unique_consecutive() (#64835)

Commit
4 years ago
Speed up torch.unique_consecutive() (#64835)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/62690

This PR reimplements `_unique_dim_cpu_impl` along the lines of `unique_consecutive_cpu_template` to get better performance.

Also, because the overhead of `unique_dim_consecutive_cpu` is quite large, `unique_consecutive_cpu_template` is now called directly when the input is known to be a 1-D array.

## Benchmark

### Script

```python
import torch
import time

torch.manual_seed(0)

# 1-D input: compare the dim=0 path with the default (no dim) path.
t = torch.randint(500, (10000000, ))
t = torch.sort(t)[0]

start = time.time()
uniques, inverse, counts = torch.unique_consecutive(t, dim=0, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive(dim=0) time:", end - start)

start = time.time()
uniques2, inverse2, counts2 = torch.unique_consecutive(t, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive() time:", end - start)

# 2-D input: time dim=0 and dim=1 separately.
t = torch.randint(500, (10000000, 2))
t = torch.sort(t)[0]

start = time.time()
uniques, inverse, counts = torch.unique_consecutive(t, dim=0, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive(dim=0) time:", end - start)

start = time.time()
uniques, inverse, counts = torch.unique_consecutive(t, dim=1, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive(dim=1) time:", end - start)
```

### Before

```
torch.unique_consecutive(dim=0) time: 78.64345622062683
torch.unique_consecutive() time: 0.029544353485107422
torch.unique_consecutive(dim=0) time: 91.49796152114868
torch.unique_consecutive(dim=1) time: 0.30872368812561035
```

### After

```
torch.unique_consecutive(dim=0) time: 0.08256125450134277
torch.unique_consecutive() time: 0.08162403106689453
torch.unique_consecutive(dim=0) time: 35.58408498764038
torch.unique_consecutive(dim=1) time: 1.6258199214935303
```

## System Information

```
Collecting environment information...
PyTorch version: 1.10.0a0+git7f1932e
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.8.10 (default, Jun 2 2021, 10:49:15) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.11.0-34-generic-x86_64-with-glibc2.29
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.10.0a0+gitbe09195
[conda] Could not collect
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64835

Reviewed By: jbschlosser

Differential Revision: D30894906

Pulled By: ngimel

fbshipit-source-id: 42ab76d638391ce6c4e589d9c71bdf7579310ad9
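
For readers unfamiliar with the operation being optimized, the following is a minimal pure-Python sketch of a single-pass consecutive-unique scan that produces the same three outputs as `return_inverse=True, return_counts=True`. It is an illustration only: the function name `unique_consecutive_reference` is hypothetical, and this is not the C++ code in `unique_consecutive_cpu_template` or `_unique_dim_cpu_impl`, whose internals are not shown in this commit message.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def unique_consecutive_reference(
    items: Sequence[T],
) -> Tuple[List[T], List[int], List[int]]:
    """Collapse runs of equal adjacent items in one left-to-right pass.

    Hypothetical reference helper, not part of PyTorch.
    Returns (uniques, inverse, counts), where inverse[i] is the index in
    `uniques` of the run containing items[i], and counts[j] is the
    length of run j.
    """
    uniques: List[T] = []
    inverse: List[int] = []
    counts: List[int] = []
    for item in items:
        # A new run starts at the first item or whenever the value changes.
        if not uniques or item != uniques[-1]:
            uniques.append(item)
            counts.append(0)
        counts[-1] += 1
        inverse.append(len(uniques) - 1)
    return uniques, inverse, counts


if __name__ == "__main__":
    # 1-D case: each element is compared with its predecessor.
    print(unique_consecutive_reference([1, 1, 2, 2, 3, 1, 1, 2]))
    # Row-wise case (analogous to dim=0 on a 2-D input): rows as tuples.
    print(unique_consecutive_reference([(0, 1), (0, 1), (2, 3)]))
```

Because only adjacent items are compared, a value that reappears after a different value starts a new run, which is why the benchmark sorts its input first to obtain globally unique values.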