[quant][core][gpu][bug-fix] Added additional caching support in quantized cudnn add_relu op
Summary:
The previous caching strategy for the quantized cuDNN add_relu operator was insufficient,
as it did not record all of the necessary information. This PR adds several
items to the CacheKey (e.g., input sizes, input dimensions) so that entries
are cached and looked up correctly.
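For illustration, below is a minimal sketch of the kind of shape-aware cache key this describes. The struct name CacheKey comes from the summary above; the specific fields, the byte-wise hash/equality helpers, and the ExecutionPlan placeholder are illustrative assumptions, not the actual PyTorch source. The idea is that a key missing input sizes and dimensionality would let a cached plan built for one shape be wrongly reused for another.
```cpp
// Hypothetical sketch (not the actual PyTorch code): a cache key for a
// quantized cuDNN add_relu execution plan that records input sizes and
// dimensionality so cached plans are only reused for matching inputs.
#include <cstdint>
#include <cstring>
#include <unordered_map>

constexpr int kMaxDim = 5;  // assumed upper bound on tensor rank

struct CacheKey {
  int64_t input_sizes[kMaxDim];  // per-dimension extents of the inputs
  uint8_t input_dim;             // number of dimensions actually in use
  uint8_t output_dtype;          // encoded quantized dtype of the output
  bool relu_fused;               // whether ReLU is fused into the add
};

// Byte-wise hash/equality, so keys must be zero-initialized (e.g. with
// memset) before filling to keep padding bytes deterministic.
struct CacheKeyHash {
  std::size_t operator()(const CacheKey& k) const {
    std::size_t h = 0;
    const auto* p = reinterpret_cast<const uint8_t*>(&k);
    for (std::size_t i = 0; i < sizeof(CacheKey); ++i) {
      h = h * 131 + p[i];  // simple polynomial rolling hash
    }
    return h;
  }
};

struct CacheKeyEqual {
  bool operator()(const CacheKey& a, const CacheKey& b) const {
    return std::memcmp(&a, &b, sizeof(CacheKey)) == 0;
  }
};

// Stand-in for the cuDNN execution-plan type held by the cache.
struct ExecutionPlan {};

std::unordered_map<CacheKey, ExecutionPlan, CacheKeyHash, CacheKeyEqual>
    plan_cache;
```
With a key like this, two calls that differ only in input shape hash to different entries instead of colliding on one cached plan, which is the failure mode the expanded CacheKey is meant to rule out.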
Test plan:
```
python test/test_quantization.py -k test_qadd_relu_cudnn
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75772
Approved by: https://github.com/jerryzh168