a66af1fe - [quant][pt2e] Add early prototype top level quantize_pt2e APIs (#90802)

Summary:
This PR introduces the top level APIs for quantization support in the PyTorch 2.0 Export stack:

* torch.ao.quantization.quantize_pt2e.prepare_pt2e
  Takes a model captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares it for post training quantization calibration.
* torch.ao.quantization.quantize_pt2e.convert_pt2e
  Takes a calibrated model and converts it to a reference quantized model that can later be lowered to quantized operator libraries or delegation modules.

Also adds a backend config for the qnnpack_pt2e backend:

* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config

Note: everything related to quantize_pt2e is experimental (prototype), and we don't provide any BC guarantees.

Test Plan:
python test/test_quantization.py TestQuantizePT2EModels

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90802
Approved by: https://github.com/qihqi
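
To make the intended workflow concrete, here is a minimal usage sketch. It assumes the prototype APIs follow the same capture → prepare → calibrate → convert flow as the existing FX-mode quantize_fx APIs; the exact signatures of prepare_pt2e (here taking a qconfig mapping, example inputs, and a backend config) and the torchdynamo capture call are assumptions and may differ, since these APIs carry no BC guarantees.

```python
# Hypothetical usage sketch of the prototype APIs named in this commit.
# Signatures are assumed, not confirmed by this commit message.
import torch
import torch._dynamo as torchdynamo
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.backend_config import get_qnnpack_pt2e_backend_config


class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)


example_inputs = (torch.randn(1, 8),)

# 1. Capture the model with the PyTorch 2.0 export (torchdynamo full graph mode).
#    The dynamo export API of this era returned (graph_module, guards); later
#    releases changed this interface.
model, _ = torchdynamo.export(M().eval(), *example_inputs, aten_graph=True)

# 2. Prepare the captured model for post training quantization calibration,
#    using the qnnpack_pt2e backend config added in this PR.
qconfig_mapping = get_default_qconfig_mapping()
backend_config = get_qnnpack_pt2e_backend_config()
model = prepare_pt2e(model, qconfig_mapping, example_inputs, backend_config)

# 3. Calibrate with representative data (a single batch here for brevity).
model(*example_inputs)

# 4. Convert to a reference quantized model that can be lowered later to
#    quantized operator libraries or delegation modules.
model = convert_pt2e(model)
```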