[quant][core][gpu][improvement] Modified quantized cudnn linear caching
Summary:
The previous CacheKey definition for the cudnn quantized linear operator was insufficient.
This PR expands the definition by adding the missing parameters to CacheKey
(e.g., input and weight sizes, kReluFused).
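
To illustrate why these fields belong in the key, here is a minimal, self-contained sketch. It is not the code from this PR: `LinearCacheKey`, `LinearCacheKeyHash`, `PlanCache`, and `demo` are hypothetical names, and a `std::string` stands in for a cached execution plan. The point is only that two calls differing in any keyed field (here, the ReLU-fusion flag) must map to separate cache entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical cache key (illustrative only, not the PR's actual layout).
struct LinearCacheKey {
  std::vector<int64_t> input_size;
  std::vector<int64_t> weight_size;
  bool relu_fused;

  bool operator==(const LinearCacheKey& other) const {
    return input_size == other.input_size &&
           weight_size == other.weight_size &&
           relu_fused == other.relu_fused;
  }
};

// Hash that mixes every field compared in operator==.
struct LinearCacheKeyHash {
  std::size_t operator()(const LinearCacheKey& key) const {
    std::size_t seed = std::hash<bool>{}(key.relu_fused);
    auto mix = [&seed](int64_t v) {
      seed ^= std::hash<int64_t>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    };
    // Mix in the ranks as well so that shapes of different length differ.
    mix(static_cast<int64_t>(key.input_size.size()));
    for (int64_t d : key.input_size) mix(d);
    mix(static_cast<int64_t>(key.weight_size.size()));
    for (int64_t d : key.weight_size) mix(d);
    return seed;
  }
};

// A std::string stands in for a cached execution plan (assumption).
using PlanCache =
    std::unordered_map<LinearCacheKey, std::string, LinearCacheKeyHash>;

// Usage: fused and unfused variants of the same shapes get separate entries.
inline void demo() {
  PlanCache cache;
  LinearCacheKey unfused{{4, 8}, {16, 8}, false};
  LinearCacheKey fused{{4, 8}, {16, 8}, true};
  cache[unfused] = "plan_without_relu";
  cache[fused] = "plan_with_relu";
  assert(cache.size() == 2);
  assert(cache.at(fused) == "plan_with_relu");
}
```

If `relu_fused` were left out of the key in this sketch, the second assignment would overwrite the first and a later lookup could return a plan built for the wrong configuration.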
Test plan:
```
python test/test_quantization.py TestQuantizedLinear.test_qlinear_cudnn
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75447
Approved by: https://github.com/jerryzh168