[Quant][core][gpu] Implemented support for bias in quantized conv operator in cudnn (#73035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73035
This PR extends https://github.com/pytorch/pytorch/pull/70622, which omitted the bias term. Here, the bias term is implemented using cuDNN's frontend API.
Bias support is implemented using optional variables and arguments.
In the future, we may wish to explore overloading raw_cudnn_convolution_forward_out
instead of using optional variables, as that may be more performant.
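As a rough illustration of the optional-argument approach (not the actual cuDNN code in this PR), here is a minimal Python sketch. The function name conv1d_forward and its signature are hypothetical and chosen only for this example; it computes a plain 1-D valid cross-correlation on floats, with no quantization.

```python
from typing import List, Optional, Sequence


def conv1d_forward(
    x: Sequence[float],
    w: Sequence[float],
    bias: Optional[float] = None,
) -> List[float]:
    """Hypothetical 1-D valid cross-correlation with an optional bias."""
    out = [
        sum(x[i + k] * w[k] for k in range(len(w)))
        for i in range(len(x) - len(w) + 1)
    ]
    # With an optional argument, the bias branch is decided at call time.
    # An overload would instead select a bias or no-bias variant up front.
    if bias is not None:
        out = [v + bias for v in out]
    return out


no_bias = conv1d_forward([1.0, 2.0, 3.0, 4.0], [1.0, 1.0])
with_bias = conv1d_forward([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], bias=0.5)
```

A single entry point keeps the call sites simple, at the cost of a per-call check for whether a bias was supplied.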
Testing:
Comment out the skip decorator
```
unittest.skip("Local only - currently the qconv2d_cudnn op is bulid "
"with USE_EXPERIMENTAL_CUDNN_V8_API, we can enable the test "
"after it is built by default")
```
above test_qconv2d_cudnn in test/quantization/core/test_quantized_op.py.
Then execute
```
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn
```
Differential Revision:
D34314535
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Pulled By: dzdang
fbshipit-source-id: c18f1b638db2cd0253e65dd0da71d72bff6759e2
(cherry picked from commit ae3d437a6ae15d1bac100211e81b10358b071131)