Fix launch bounds for grid_sampler_3d (#60385)
Summary:
Reduced the launch bounds (maximum threads per block) for grid_sampler_3d from 1024 to 512 and for grid_sampler_3d_backward from 1024 to 256.
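As a rough illustration of why a lower bound can help (this PR does not state the reason, so treat it as an assumption): a CUDA `__launch_bounds__` value tells the compiler how many threads a block may have, which caps the registers it can give each thread. The sketch below assumes the documented limits for compute capability 7.0 (the Titan V): 65,536 registers per block and at most 255 registers per thread. The helper `reg_budget` is hypothetical and ignores register allocation granularity.

```python
# Assumed hardware limits for compute capability 7.0 (e.g. Titan V).
# These are not taken from the PR; they come from the CUDA documentation.
REGS_PER_BLOCK = 65536       # 32-bit registers available to one thread block
MAX_REGS_PER_THREAD = 255    # architectural per-thread register cap


def reg_budget(max_threads_per_block: int) -> int:
    """Approximate per-thread register ceiling implied by a launch bound.

    Hypothetical helper: the compiler must keep a block of
    max_threads_per_block threads launchable, so each thread gets at most
    an equal share of the block's registers, capped by the per-thread limit.
    """
    return min(REGS_PER_BLOCK // max_threads_per_block, MAX_REGS_PER_THREAD)


# Launch bounds before and after this change, as listed in the summary.
changes = [
    ("grid_sampler_3d", 1024, 512),
    ("grid_sampler_3d_backward", 1024, 256),
]

for name, old, new in changes:
    print(f"{name}: bound {old} -> {reg_budget(old)} regs/thread, "
          f"bound {new} -> {reg_budget(new)} regs/thread")
# grid_sampler_3d: bound 1024 -> 64 regs/thread, bound 512 -> 128 regs/thread
# grid_sampler_3d_backward: bound 1024 -> 64 regs/thread, bound 256 -> 255 regs/thread
```

Under these assumptions, a bound of 1024 leaves each thread about 64 registers, which can force a register-heavy kernel to spill to local memory. Whether that is what happened here is best judged from the timing data.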
Timing data (measured on an NVIDIA Titan V):
![GridSampler3dTimingData](https://user-images.githubusercontent.com/22803332/122813457-d3c12300-d287-11eb-99c1-6572f539660f.PNG)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60385
Reviewed By: jbschlosser
Differential Revision: D29433741
Pulled By: ngimel
fbshipit-source-id: 7f475d0c2e854ae65dd0f1fb0167dfae7e506ec9