pytorch
381e7259 - [quant][core][gpu][bug fix] Added clone and contiguous() to broadcasted_bias tensor in quantized cudnn linear op

Commit
3 years ago
Summary:
The previous implementation of broadcasted_bias in the quantized cudnn linear op has 2 issues.
1) broadcasted_bias is a view of the input bias tensor. This is not desired, as any modification to broadcasted_bias is also applied to the input bias. To remedy this, we clone the input bias tensor.
2) Calling broadcast_to doesn't affect the storage, which is problematic for the cudnn operations. We need to create a fully broadcasted tensor, rather than a view (which is what's returned by broadcast_to). To remedy this, we call contiguous().

Test plan:
python test/test_quantization.py -k test_linear_cudnn

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75944
Approved by: https://github.com/jerryzh168
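
The two issues can be reproduced from Python with the public tensor API. The following is a minimal sketch, not the code from the commit (which lives in the C++ quantized cudnn linear op): the names bias, view, and materialized, the shape (4, 3), and the order clone -> broadcast_to -> contiguous are all assumptions chosen for illustration.

```python
import torch

# Hypothetical 1-D bias; the real op works on a quantized linear layer's bias.
bias = torch.tensor([1.0, 2.0, 3.0])

# broadcast_to returns a view: no new storage, stride 0 along the new dim.
view = torch.broadcast_to(bias, (4, 3))
shares_storage = view.data_ptr() == bias.data_ptr()
view_stride = view.stride()
view_is_contiguous = view.is_contiguous()

# Assumed fix pattern: clone to detach from the input bias, then call
# contiguous() so every broadcasted element is physically materialized.
materialized = bias.clone().broadcast_to((4, 3)).contiguous()

# A later write to the input bias is visible through the view,
# but not through the materialized copy.
bias[0] = 10.0
seen_through_view = view[2, 0].item()
seen_through_copy = materialized[2, 0].item()

print(shares_storage, view_stride, view_is_contiguous)
print(materialized.stride(), materialized.is_contiguous())
print(seen_through_view, seen_through_copy)
```

The view reports a stride of 0 along the broadcast dimension, which is why a library that reads raw memory with dense strides would not see a fully broadcasted tensor until contiguous() is called.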