Updates to Scale and Zero Point Gradient Calculation (#42034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42034
In this diff, the scale and zero point gradient calculations are updated to correctly reflect the actual backpropagation equations: the local derivative with respect to scale must be multiplied by the upstream gradient `dY`, not by the input `dX` (i.e. the near-final output should be `dScale * dY` instead of `dScale * dX`); the same applies to zero point.
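The correction can be illustrated with a minimal numpy sketch of a learnable fake-quantize backward pass. This is an illustrative assumption of the gradient formulas (straight-through estimator for the input, LSQ-style derivatives for scale and zero point), not the actual kernel code from this diff; the function name and signature are hypothetical:

```python
import numpy as np

def fake_quant_backward(x, grad_y, scale, zero_point, qmin, qmax):
    """Sketch of a learnable fake-quantize backward pass (hypothetical helper).

    Assumed forward:
        y = (clip(round(x / scale) + zero_point, qmin, qmax) - zero_point) * scale

    The key point of the fix: each local derivative is multiplied by the
    upstream gradient grad_y (dL/dY), not by the input x.
    """
    xq = np.round(x / scale) + zero_point
    in_range = (xq >= qmin) & (xq <= qmax)

    # dY/dX: straight-through estimator -- 1 inside the clamp range, 0 outside.
    grad_x = grad_y * in_range

    # dY/dScale: (round(x/scale) - x/scale) in range; (q_bound - zero_point) when clamped.
    d_scale = np.where(
        in_range,
        np.round(x / scale) - x / scale,
        np.where(xq < qmin, qmin - zero_point, qmax - zero_point),
    )
    grad_scale = np.sum(grad_y * d_scale)  # dScale * dY, not dScale * dX

    # dY/dZeroPoint: 0 in range; -scale when clamped.
    d_zp = np.where(in_range, 0.0, -scale)
    grad_zero_point = np.sum(grad_y * d_zp)

    return grad_x, grad_scale, grad_zero_point
```

With an upstream gradient of ones, `grad_scale` reduces to the sum of the local scale derivatives over the batch, which is the quantity the corrected kernels should produce.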
Test Plan:
To run the unit tests for all affected learnable fake quantize modules and kernels, execute the following command on a devvm:
`buck test //caffe2/test:quantization -- learnable`
To also run the CUDA tests, execute the following command:
`buck test mode/dev-nosan //caffe2/test:quantization -- learnable`
Reviewed By: jerryzh168
Differential Revision: D22735668
fbshipit-source-id: 45c1e0fd38cbb2d8d5e60be4711e1e989e9743b4