Migrate glu from the THC to ATen (CUDA) (#61153)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61153
Fixes gh-24571, fixes gh-24572
Closes gh-39586
Benchmarks
----------
The benchmarks were run with nvprof, calling the operator in a loop. They show
reliable improvements for large tensors, while the TH implementation fares
better for smaller tensors. For sufficiently large tensors, the ATen
implementation does win out.
| Shape | Dim | Master Forward (us) | This PR Forward (us) | Master Backward (us) | This PR Backward (us) |
|-------------:|-----|:-------------------:|:--------------------:|:--------------------:|:---------------------:|
| 128, 1000 | 0 | 2.4770 | 2.0820 | 3.0440 | 3.4680 |
| | 1 | 2.7060 | 4.4850 | 3.3380 | 3.6250 |
| 128, 10000 | 0 | 26.531 | 21.366 | 38.083 | 34.623 |
| | 1 | 27.680 | 30.465 | 38.943 | 35.204 |
| 128, 100000 | 0 | 292.09 | 219.56 | 355.57 | 324.49 |
| | 1 | 260.43 | 243.08 | 332.25 | 323.37 |
| 128, 1000000 | 0 | 2475.7 | 1874.6 | 3810.1 | 3215.7 |
| | 1 | 2586.3 | 2380.9 | 3349.9 | 3207.8 |
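For reference, the glu semantics being migrated can be sketched in NumPy. This is a minimal illustration of the operator's math only, not the ATen CUDA kernel; the function name and NumPy-based formulation here are illustrative assumptions:

```python
import numpy as np

def glu(x, dim=-1):
    # Gated Linear Unit: split x into two equal halves a, b along `dim`,
    # then gate the first half with the sigmoid of the second:
    #   glu(x) = a * sigmoid(b)
    # Illustrative sketch of the semantics; not the CUDA implementation.
    a, b = np.split(x, 2, axis=dim)
    return a * (1.0 / (1.0 + np.exp(-b)))
```

Splitting along `dim` halves that dimension, which is why the benchmark shapes above pair an even size with each tested dim.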
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D29538093
Pulled By: ngimel
fbshipit-source-id: 1f66b45ec7c46fb8e680b50110a5fde6fe7faab7