[quant][gpu] Adding quantized conv operator in cudnn (#70622)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70622
This is the initial PR adding eager mode quantized GPU operator support. We start
with convolution, following the cudnn fp32 Conv code and the example cudnn frontend code:
https://github.com/pytorch/pytorch/pull/51390
https://github.com/NVIDIA/cudnn-frontend/blob/main/samples/fusion_sample.cpp#L557
Test Plan:
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33409155
fbshipit-source-id: cb5183d274993fcd2c3ab6de8ae022baa9f89f7f
(cherry picked from commit 4fde5559dee2a28907b09f96bc5a8dd259148d2e)