[quant][graphmode] produce reference pattern for binary ops and then rewrite to quantized op (#72953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72953
This PR makes BinaryOpQuantizeHandler always produce reference patterns, and a
custom lowering pass rewrites the reference quantized patterns to quantized ops.
It includes rewrites for
torch.ops.quantized.add, torch.ops.quantized.mul, torch.ops.quantized.matmul
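
For illustration only, the rewrite can be sketched on a toy expression tree.
This is not the lowering code in this PR: the tuple encoding, `BINARY_OPS`, and
`lower` are hypothetical names, and scale/zero_point arguments and scalar
operands are left out. It assumes a reference pattern has the shape
dequantize -> float op -> quantize.

```python
# Toy sketch: expressions are nested tuples such as ("add", lhs, rhs).
# Hypothetical mapping from a float binary op to its quantized counterpart.
BINARY_OPS = {
    "add": "quantized.add",
    "mul": "quantized.mul",
    "matmul": "quantized.matmul",
}


def lower(expr):
    """Rewrite quantize(op(dequantize(a), dequantize(b))) to quantized_op(a, b)."""
    if not isinstance(expr, tuple):
        return expr  # leaf, e.g. an input name
    # Lower children first so nested reference patterns are rewritten bottom-up.
    expr = tuple(lower(e) for e in expr)
    if expr[0] != "quantize" or len(expr) != 2:
        return expr
    inner = expr[1]
    if not (isinstance(inner, tuple) and len(inner) == 3 and inner[0] in BINARY_OPS):
        return expr
    op, lhs, rhs = inner
    both_dequantized = all(
        isinstance(arg, tuple) and len(arg) == 2 and arg[0] == "dequantize"
        for arg in (lhs, rhs)
    )
    if not both_dequantized:
        return expr  # not a reference pattern, leave it untouched
    return (BINARY_OPS[op], lhs[1], rhs[1])


reference = ("quantize", ("add", ("dequantize", "x"), ("dequantize", "y")))
lowered = lower(reference)
```

Patterns that do not match (for example, an add with a non-dequantized input)
are returned unchanged in this sketch.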
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: gchanan
Differential Revision: D34292408
fbshipit-source-id: 9872a5098249bc77db15e9fb614416958e62b9b2
(cherry picked from commit dbdc61ee8b5dde2e54a34a370a3af887e5117398)