Add dynamic quantized Linear op in PyTorch (#23464)
Summary:
As suggested in https://github.com/pytorch/pytorch/pull/22891, this adds an overload for torch.fbgemm_linear_int8_weight (the dynamically quantized version of the linear function) that takes a PackedLinearWeight as input and matches the signature of the regular aten::linear.
The previous diff, D16381552, was reverted because `quantize_linear` expects the scale to be a `float` and the zero_point to be an `int`.
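For context, a minimal NumPy sketch of the dynamic quantized linear semantics: the weight is affine-quantized to int8 ahead of time (with a `float` scale and an `int` zero_point, matching what `quantize_linear` expects), while activations stay fp32 and the weight is dequantized on the fly. This is an illustrative approximation of the technique, not the FBGEMM implementation; all function names here are hypothetical.

```python
import numpy as np

def quantize_weight(w, num_bits=8):
    # Affine per-tensor quantization. Note the types: scale is a
    # Python float, zero_point a Python int.
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    w_min, w_max = min(float(w.min()), 0.0), max(float(w.max()), 0.0)
    scale = float((w_max - w_min) / (qmax - qmin))
    zero_point = int(round(qmin - w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dynamic_quantized_linear(x, q_w, scale, zero_point, bias):
    # Activations stay fp32; the int8 weight is dequantized on the fly,
    # so the call signature mirrors a regular fp32 linear.
    w = (q_w.astype(np.float32) - zero_point) * scale
    return x @ w.T + bias

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3)).astype(np.float32)
x = rng.standard_normal((2, 3)).astype(np.float32)
b = np.zeros(4, dtype=np.float32)

q_w, s, zp = quantize_weight(w)
y = dynamic_quantized_linear(x, q_w, s, zp, b)
# y closely approximates the fp32 result x @ w.T + b
```

The real op packs the quantized weight (PackedLinearWeight) once so repeated calls skip the quantize/pack cost; the sketch above dequantizes per call only to keep the arithmetic visible.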
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23464
ghstack-source-id: 88257231
Differential Revision: D16527741
fbshipit-source-id: 66585f668c6e623c50514eb11633bb711d8767f2