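fix AvgPool2d for 2^31-1 sized inputs, and get test_cuda_kernel_loop_overflow_large to working state (#30771)
Summary:
Fix AvgPool2d for 2^31-1 sized inputs, and get test_cuda_kernel_loop_overflow_large to a working state.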
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30771
Differential Revision: D18821529
Pulled By: ngimel
fbshipit-source-id: c5cbf56e686a2a3cfc7274dd96db37289dac7588