ae79f95c - [quant][fx][pt2e][refactor] Refactor prepare.py for upcoming quantize_pt2e changes (#92641)

[quant][fx][pt2e][refactor] Refactor prepare.py for upcoming quantize_pt2e changes (#92641)

Summary:
Changes node.meta["target_dtype_info"] to store observer/fake_quant constructors instead of (dtype, is_dynamic), so that in the future users can configure this themselves. Follow-up refactors:

(1) Generalize the structure of "target_dtype_info". Right now we have "input_act_obs_or_fq_ctr", "weight_obs_or_fq_ctr", "bias_obs_or_fq_ctr", and "output_obs_or_fq_ctr". This works OK for current use cases, but users need a separate config to specify which input is the weight and which input is the bias. To generalize it, we should expose an API that lets users specify a dictionary from input_index to obs_or_fq_ctr and from output_index to obs_or_fq_ctr, e.g. for

out1, (out2, out3) = op(arg0, (arg1, arg2))

"input_act_obs_or_fq_ctr" = {0: obs1, 1: obs2}
"output_act_obs_or_fq_ctr" = {0: obs3, 1: obs4}

Note that this would not allow configuring obs/fq for nested structures. Alternatively, we could have a config that mimics the structure of the arguments and outputs, e.g. for

out1, (out2, out3) = op(arg0, (arg1, arg2))

"input_act_obs_or_fq_ctr" = (obs1, (obs2, obs3))
"output_act_obs_or_fq_ctr" = (obs4, (obs5, obs6))

(2) Use these observer/fake_quant constructors directly when inserting observers, instead of going through qconfig.

(3) Clean up the TODOs in the code base.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92641
Approved by: https://github.com/jcaip
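The two proposed layouts above can be sketched in plain Python. This is a hypothetical illustration, not code from the PR: `make_obs_ctr`, the `obs1..obs6` names, and the dict/tuple values stand in for real observer/fake_quant constructors (e.g. `torch.ao.quantization` observer classes or partials of them), which are not imported here.

```python
def make_obs_ctr(name):
    """Placeholder for an observer/fake_quant constructor (hypothetical)."""
    def ctr():
        # A real constructor would return an observer module instance;
        # a dict stands in for that instance in this sketch.
        return {"observer": name}
    ctr.obs_name = name
    return ctr

obs1, obs2, obs3, obs4, obs5, obs6 = (
    make_obs_ctr(f"obs{i}") for i in range(1, 7)
)

# Index-based layout: map argument/output position -> constructor.
# For: out1, (out2, out3) = op(arg0, (arg1, arg2))
target_dtype_info_indexed = {
    "input_act_obs_or_fq_ctr": {0: obs1, 1: obs2},
    "output_act_obs_or_fq_ctr": {0: obs3, 1: obs4},
}

# Structure-mimicking layout: the config nests the same way the
# args/outputs do, so elements of nested tuples can be configured.
target_dtype_info_nested = {
    "input_act_obs_or_fq_ctr": (obs1, (obs2, obs3)),
    "output_act_obs_or_fq_ctr": (obs4, (obs5, obs6)),
}

# Prepare-time code would call the constructor to get an observer
# for a given input, e.g. input 0 under the indexed layout:
obs_instance = target_dtype_info_indexed["input_act_obs_or_fq_ctr"][0]()
```

The trade-off the summary describes is visible here: the indexed layout is flat and simple but cannot address `arg1`/`arg2` inside the nested tuple individually, while the structure-mimicking layout can.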