pytorch
926bb5d6 - changed launch bounds, unrolled for loop for grid sampler 2d fwd and bwd (#60405)

Commit
4 years ago
Summary:
Changed launch bounds for grid sampler 2d fwd and bwd from 1024 to 256, and added loop unrolling to fix register spilling into local memory.

Timing Data (using Nvidia Titan-V):
Interpolation mode 2, padding 0, align corners False

![GridSampler2dTimingData](https://user-images.githubusercontent.com/22803332/122830305-01fd2d80-d29d-11eb-9cd3-7da533a03f33.PNG)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60405

Reviewed By: albanD

Differential Revision: D29288075

Pulled By: ngimel

fbshipit-source-id: 5e060f0c2d1cc0a3086718e6be263413dfa29689
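The two techniques named in the summary can be illustrated with a minimal sketch. This is not the grid sampler kernel from the PR: `weighted_sum_kernel`, `kMaxThreads`, and `kTaps` are hypothetical names chosen for illustration. The assumption is that a lower `__launch_bounds__` value lets the compiler budget more registers per thread, and that fully unrolling a fixed-trip-count loop turns indices into a small per-thread array into compile-time constants, so the array can stay in registers instead of local memory.

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical values for illustration; the commit lowers the real bound from 1024 to 256.
constexpr int kMaxThreads = 256;
constexpr int kTaps = 4;  // fixed trip count so the loops can be fully unrolled

// __launch_bounds__ promises the compiler that no block will exceed
// kMaxThreads threads, which raises the per-thread register budget.
__global__ void __launch_bounds__(kMaxThreads)
weighted_sum_kernel(const float* in, const float* weights, float* out, int n) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx >= n) return;

  // Small per-thread array; with unrolled loops every index is a constant.
  float coeffs[kTaps];
  #pragma unroll
  for (int i = 0; i < kTaps; ++i) {
    coeffs[i] = weights[i];
  }

  float acc = 0.f;
  #pragma unroll
  for (int i = 0; i < kTaps; ++i) {
    int src = min(idx + i, n - 1);  // clamp reads at the end of the buffer
    acc += coeffs[i] * in[src];
  }
  out[idx] = acc;
}

int main() {
  const int n = 1 << 16;
  float *in, *weights, *out;
  cudaMallocManaged(&in, n * sizeof(float));
  cudaMallocManaged(&weights, kTaps * sizeof(float));
  cudaMallocManaged(&out, n * sizeof(float));

  for (int i = 0; i < n; ++i) in[i] = 1.0f;
  for (int i = 0; i < kTaps; ++i) weights[i] = 0.25f;

  // The launch must respect the bound declared on the kernel.
  const int blocks = (n + kMaxThreads - 1) / kMaxThreads;
  weighted_sum_kernel<<<blocks, kMaxThreads>>>(in, weights, out, n);
  if (cudaDeviceSynchronize() != cudaSuccess) {
    std::printf("kernel failed\n");
    return 1;
  }

  // Every output is 4 * 0.25 * 1.0, so any deviation indicates a bug.
  int bad = 0;
  for (int i = 0; i < n; ++i) {
    if (std::fabs(out[i] - 1.0f) > 1e-6f) ++bad;
  }
  std::printf("mismatches: %d\n", bad);

  cudaFree(in);
  cudaFree(weights);
  cudaFree(out);
  return bad == 0 ? 0 : 1;
}
```

Whether a given kernel spills can be checked by compiling with `nvcc -Xptxas -v`, which reports register usage and spill stores and loads per kernel. Launching this kernel with more than `kMaxThreads` threads per block would fail, which is the trade-off of a tighter bound.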