pytorch
7ddf212f - [quant][fx] Fully align convert with the reference model design and simplify the implementation (#73863)

Commit

2 years ago

[quant][fx] Fully align convert with the reference model design and simplify the implementation (#73863)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863

This PR fully aligns the convert function with the design in https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first, and then lowering that model to a quantized model that is runnable with the PyTorch native backends (fbgemm/qnnpack).

This PR makes convert.py much easier to understand than the previous implementation, and it also lets us remove the majority of the code in quantization_patterns.py (in follow-up PRs).

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34778506

fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
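
Illustration (not part of the commit): the two-stage flow described in the summary, sketched as a toy in plain Python. It assumes a reference pattern has the shape dequantize -> float op -> quantize and that lowering fuses that pattern into one backend op. The op names, `to_reference`, and `lower` are hypothetical and are not PyTorch identifiers. The real convert works on an FX graph, not a list of strings.

```python
from typing import List

# Hypothetical op names for illustration only; these are not PyTorch identifiers.
REFERENCE_PATTERN = ["dequantize", "linear", "quantize"]
LOWERED_OP = "quantized_linear"


def to_reference(float_ops: List[str]) -> List[str]:
    """Stage 1: wrap every float op in dequantize/quantize (a reference pattern)."""
    out = ["quantize"]  # quantize the model input once
    for op in float_ops:
        out += ["dequantize", op, "quantize"]
    out.append("dequantize")  # hand a float result back to the caller
    return out


def lower(ops: List[str]) -> List[str]:
    """Stage 2: fuse each reference pattern into a single backend op."""
    out: List[str] = []
    n = len(REFERENCE_PATTERN)
    i = 0
    while i < len(ops):
        if ops[i:i + n] == REFERENCE_PATTERN:
            out.append(LOWERED_OP)  # pattern matched: emit the fused op
            i += n
        else:
            out.append(ops[i])  # no match: keep the op unchanged
            i += 1
    return out


reference = to_reference(["linear", "linear"])
lowered = lower(reference)
```

Because stage 1 always emits the same reference form, stage 2 is the only backend-specific step in this sketch, which mirrors the separation the summary describes.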

References

#74332 - Merge master into lazy_tensor_staging

Author

jerryzh168

Committer

pytorchmergebot

Parents

7070fe4d
