[quant][core][gpu][feature] Implemented quantized conv1d cudnn op
Summary:
Previously, only the quantized conv2d cuDNN op was implemented. This PR
implements the 1d variant. Because cuDNN does not support conv1d
directly, the 1d case is cast to a 2d case by adding a dummy
dimension of size 1 to the input and weight tensors. This is
analogous to how it was done for quantized CPU conv1d (e.g., see
`quantized/cpu/qconv.cpp`).
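The 1d-to-2d cast can be illustrated with a minimal plain-Python sketch.
It is not the cuDNN code in this PR: the helper names (`conv1d`, `conv2d`,
`conv1d_via_conv2d`) are hypothetical, and it assumes a single channel,
stride 1, no padding, and plain integers standing in for quantized values
(no scales or zero points).

```python
# Illustrative only: reference convolutions over Python lists, showing that
# a 1d convolution equals a 2d convolution over operands with a dummy
# dimension of size 1. Assumes one channel, stride 1, no padding.

def conv1d(x, w):
    """Valid cross-correlation of a length-L list x with a length-K list w."""
    n_out = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n_out)]

def conv2d(x, w):
    """Valid cross-correlation of an H x W grid x with a KH x KW grid w."""
    kh, kw = len(w), len(w[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def conv1d_via_conv2d(x, w):
    """Add a dummy dimension of size 1 to both operands, then drop it again."""
    out = conv2d([x], [w])  # shapes (1, L) and (1, K) give (1, L - K + 1)
    return out[0]

x = [1, 2, 3, 4, 5]
w = [1, 0, -1]
result = conv1d_via_conv2d(x, w)  # same values as conv1d(x, w)
```

Because the added dimension has size 1 in both the input and the weight,
the 2d kernel has exactly one position along it, so the output matches
the 1d result.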
A corresponding test case was added in `test_quantized_op.py`. This
test should ideally be merged into `test_qconv1d` once cuDNN flags are
enabled and available in PyTorch.
Test Plan:
```
python test/test_quantization.py -k test_qconv1d_cudnn
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77175
Approved by: https://github.com/jerryzh168