pytorch
0585daae - fixed launch bounds for gathertopk kernel (#60314)

Commit
4 years ago
fixed launch bounds for gathertopk kernel (#60314)

Summary:
Changed launch bounds for gatherTopK kernel to fix register spilling into local memory.

Comparison (Nvidia Titan-V GPU):
Args: Input size as below, k=32, dim=None
![TopKTimingData](https://user-images.githubusercontent.com/22803332/122624922-46978780-d057-11eb-9b52-d5786da432c0.PNG)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60314

Reviewed By: mruberry

Differential Revision: D29267789

Pulled By: ngimel

fbshipit-source-id: 4056efb2e44e5527786167af66a127504980a3af
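
For readers unfamiliar with the mechanism named in the summary, here is a minimal sketch of how a CUDA launch bound is declared. It is not the gatherTopK kernel and does not reproduce the values changed in this commit: `scaleKernel`, `scaleOnDevice`, and `kMaxThreadsPerBlock = 256` are hypothetical names and numbers chosen for illustration. The `__launch_bounds__` qualifier tells the compiler the largest block size the kernel will be launched with, which in turn determines how many registers it may assign per thread before it has to spill to local memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical block-size cap, chosen for illustration only.
// It is NOT the value used by the gatherTopK kernel in this commit.
constexpr int kMaxThreadsPerBlock = 256;

// __launch_bounds__ promises the compiler that this kernel is never launched
// with more than kMaxThreadsPerBlock threads per block. A smaller bound leaves
// more registers available per thread, which can reduce spilling.
__global__ void __launch_bounds__(kMaxThreadsPerBlock)
scaleKernel(const float* in, float* out, int n, float factor) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    out[i] = in[i] * factor;
  }
}

// Runs the kernel on n elements and checks every result on the host.
bool scaleOnDevice(int n, float factor) {
  float* in = nullptr;
  float* out = nullptr;
  if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess) {
    return false;
  }
  if (cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) {
    cudaFree(in);
    return false;
  }
  for (int i = 0; i < n; ++i) {
    in[i] = static_cast<float>(i);
  }

  // The launch must respect the declared bound: block size <= kMaxThreadsPerBlock.
  int blocks = (n + kMaxThreadsPerBlock - 1) / kMaxThreadsPerBlock;
  scaleKernel<<<blocks, kMaxThreadsPerBlock>>>(in, out, n, factor);

  bool ok = (cudaGetLastError() == cudaSuccess) &&
            (cudaDeviceSynchronize() == cudaSuccess);
  for (int i = 0; ok && i < n; ++i) {
    ok = (out[i] == in[i] * factor);
  }

  cudaFree(in);
  cudaFree(out);
  return ok;
}

int main() {
  std::printf("%s\n", scaleOnDevice(1 << 20, 2.0f) ? "ok" : "failed");
  return 0;
}
```

Compiling with `nvcc -Xptxas -v` prints per-kernel register usage along with spill stores and loads, which is one way to see whether a given bound causes spilling. Launching a kernel with more threads per block than its declared bound fails, so the bound and the launch configuration have to be changed together.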