pytorch
604a414b - [quant][pt2] Fix convert in Conv + BN QAT fusion (#102224)


Commit

1 year ago

[quant][pt2] Fix convert in Conv + BN QAT fusion (#102224)

Summary:
Previously, the test for the convert flow in Conv + BN QAT fusion was not enabled by mistake. However, reenabling this test uncovered several bugs:

(1) The replaced nodes returned by subgraph rewriter were not handled correctly. This is because a recent change in the subgraph rewriter (#100556) fixed only the prepare case but not the convert case. This commit brings this fix to the convert case as well and deduplicates some code between the two cases.

(2) When folding BN into conv, we used the wrong arg index to get the BN eps value. This resulted in an incorrect conv weight.

(3) In FX, we currently do a hack for weighted modules where we observe the weights once in convert in order to ensure we get the right shapes for these weight observers. This caused the numerics to diverge between PT2 and FX. This commit fixes this by skipping this unnecessary hack for `_convert_to_reference_decomposed_fx`.

(4) Per channel support was simply missing. This commit adds support for this by matching the quantize_per_channel and dequantize_per_channel ops in addition to the existing ones.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_numerics

Reviewed By: jerryzh168

Differential Revision: D46097783

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102224
Approved by: https://github.com/jerryzh168
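
To illustrate why the wrong eps in (2) produces an incorrect conv weight, here is a minimal pure-Python sketch of the standard Conv + BN folding formula. It is not the code from this commit: the function names (`fold_bn_into_conv`, `conv_then_bn`, `conv_only`) and the list-based tensor layout are assumptions made for illustration, and the "conv" is reduced to one dot product per output channel.

```python
import math


def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps):
    """Fold per-channel BN parameters into conv weight and bias.

    weight: one flattened kernel (list of floats) per output channel.
    bias, gamma, beta, mean, var: one float per output channel.
    (Hypothetical helper for illustration, not a PyTorch API.)
    """
    folded_w, folded_b = [], []
    for c in range(len(weight)):
        # eps enters the per-channel scale, so a wrong eps rescales the weight
        scale = gamma[c] / math.sqrt(var[c] + eps)
        folded_w.append([w * scale for w in weight[c]])
        folded_b.append((bias[c] - mean[c]) * scale + beta[c])
    return folded_w, folded_b


def conv_only(x, weight, bias):
    # Simplified "conv": one dot product per output channel
    return [sum(w * xi for w, xi in zip(weight[c], x)) + bias[c]
            for c in range(len(weight))]


def conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps):
    # Reference path: conv followed by inference-mode batch norm
    y = conv_only(x, weight, bias)
    return [(y[c] - mean[c]) / math.sqrt(var[c] + eps) * gamma[c] + beta[c]
            for c in range(len(y))]


# Made-up example values
weight = [[1.0, 2.0], [0.5, -1.0]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.5], [0.0, 1.0]
mean, var = [0.2, -0.1], [0.01, 0.04]
eps = 1e-5
x = [1.0, -2.0]

reference = conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps)
good_w, good_b = fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps)
# Simulate reading eps from the wrong argument slot (0.1 is an arbitrary stand-in)
bad_w, bad_b = fold_bn_into_conv(weight, bias, gamma, beta, mean, var, 0.1)
```

With the correct eps, `conv_only(x, good_w, good_b)` matches `reference` up to floating-point error, while the weights folded with the wrong value differ noticeably, which is the kind of numerics mismatch the re-enabled test is meant to catch.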

References

gh/bdhirsh/420/base

gh/titaiwangms/26/base

Author

andrewor14

Committer

pytorchmergebot

Parents

4bb2b65e
