[quant][core][gpu] Implemented max pooling 2D using cudnn (#74673)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74673
Quantized 2D max pooling was implemented using cudnn, and a corresponding
test case was added. The operator requires cudnn version 8.3.3 or higher.
The v7 API (https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnPoolingForward)
is used because cudnn currently provides no v8 API for pooling.
Note that the current implementation does not support dilated pooling,
as cudnn does not support it.
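For reviewers who want a mental model of what the operator computes, below is a
minimal pure-Python reference sketch of non-dilated 2D max pooling over int8
values. It is not the cudnn code path from this PR; the function name
`max_pool2d_int8` and its argument layout are hypothetical and chosen only for
illustration. It assumes a quantization scale that is positive, so taking the
max over the raw integer values picks the same element as taking the max over
the dequantized values.

```python
def max_pool2d_int8(x, kernel, stride, padding):
    """Reference 2D max pooling over a list of rows of int8 values (no dilation).

    Hypothetical helper for illustration only, not part of PyTorch or cudnn.
    """
    kh, kw = kernel
    sh, sw = stride
    ph, pw = padding
    h, w = len(x), len(x[0])
    # Standard output-size formula for pooling with dilation fixed at 1.
    out_h = (h + 2 * ph - kh) // sh + 1
    out_w = (w + 2 * pw - kw) // sw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            best = -128  # padded positions behave like the smallest int8 value
            for di in range(kh):
                for dj in range(kw):
                    r = i * sh - ph + di
                    c = j * sw - pw + dj
                    if 0 <= r < h and 0 <= c < w:
                        best = max(best, x[r][c])
            row.append(best)
        out.append(row)
    return out


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(max_pool2d_int8(x, kernel=(2, 2), stride=(2, 2), padding=(0, 0)))
```

A reference like this can be handy for spot-checking small inputs by hand; the
actual test in this PR is the one listed in the test plan.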
Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestQuantizedOps.test_max_pool2d_cudnn
```
Differential Revision:
D35135200
Reviewed By: jerryzh168
Pulled By: dzdang
fbshipit-source-id: 199991031e0419e13578a1d4abbe17dd2ed98f66
(cherry picked from commit d0390a642cb3b9fba54e99a31d4759247706c5d9)