[quant][fx] Add pass in convert to fold quant-dequant sequence (#54860)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54860
Currently we insert a quantize_per_tensor op whenever we encounter a quantizable input.
If that input has multiple uses and not all of them are quantizable, we need to add a
dequantize op before the non-quantizable ops.
This pass folds each quantize_per_tensor -> dequantize sequence, since the pair is
effectively a no-op.
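As a rough illustration of the folding described above, here is a minimal sketch on a toy graph. It does not use the torch.fx API: the `Node` class, the `fold_quant_dequant` helper, and the op names `quantized_linear` and `sigmoid` are hypothetical and chosen only for this example. It also assumes that folding means rewiring users of the dequantize node to the float input of the quantize node, and dropping a quantize node only when nothing else uses it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """Hypothetical graph node: an op name plus its input nodes."""
    op: str
    inputs: List["Node"] = field(default_factory=list)


def fold_quant_dequant(nodes: List[Node]) -> List[Node]:
    """Fold quantize_per_tensor -> dequantize pairs in a toy node list."""
    # Map each foldable dequantize node to the float node feeding its quantize.
    replaced = {}
    for n in nodes:
        if (n.op == "dequantize" and n.inputs
                and n.inputs[0].op == "quantize_per_tensor"):
            replaced[id(n)] = n.inputs[0].inputs[0]

    # Rewire every user of a folded dequantize to the original float node.
    for n in nodes:
        n.inputs = [replaced.get(id(i), i) for i in n.inputs]

    # Drop the folded dequantize nodes.
    kept = [n for n in nodes if id(n) not in replaced]

    # Drop quantize nodes that no longer have any users.
    used = {id(i) for n in kept for i in n.inputs}
    return [n for n in kept
            if n.op != "quantize_per_tensor" or id(n) in used]


# One quantized input with a quantizable use and a non-quantizable use.
x = Node("placeholder")
q = Node("quantize_per_tensor", [x])
fc = Node("quantized_linear", [q])   # consumes the quantized tensor
dq = Node("dequantize", [q])         # inserted for the float-only op
sig = Node("sigmoid", [dq])          # float-only op
graph = [x, q, fc, dq, sig]

folded = fold_quant_dequant(graph)
print([n.op for n in folded])
```

In this toy case the quantize node survives because the quantized linear op still uses it, while the dequantize node disappears and the float-only op reads the original input directly. That matches the pattern in the counts reported in this message, where the quantize_per_tensor count is unchanged and only the dequantize count drops.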
[internal only][pyper]
Before this change we had redundant dequantize nodes in the graph.
Example 1x inline_cvr graph: https://www.internalfb.com/intern/everpaste/?handle=GODBxAlUMzGHD6MSACpHKKu9qjorbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 49
After this change
https://www.internalfb.com/intern/everpaste/?handle=GAl0uQnOlDNmpLoSAB-GZqRxu9wMbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 39
This removes 10 redundant dequantize nodes from the graph.
Test Plan:
python test/test_quantization.py test_fold_quant_dequant
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27390506
fbshipit-source-id: 56e6fb8496171246eccf4bd45eb8bebd87fcb740