[quant][fx][pt2e][refactor] Refactor prepare.py for upcoming quantize_pt2e changes (#92641)
Summary:
Changes node.meta["target_dtype_info"] to store observer/fake_quant constructors instead of (dtype, is_dynamic) tuples, so that in the future users can configure this themselves. Follow-up refactors:
(1) Generalize the structure of "target_dtype_info". Right now we have "input_act_obs_or_fq_ctr", "weight_obs_or_fq_ctr", "bias_obs_or_fq_ctr", and "output_obs_or_fq_ctr".
This works OK for current use cases, but users have to rely on a separate config to specify which input is the weight and which is the bias. To generalize it,
we should expose an API that lets users specify either a dictionary from input_index to obs_or_fq_ctr and from output_index to obs_or_fq_ctr, e.g. for
out1, (out2, out3) = op(arg0, (arg1, arg2)):
"input_act_obs_or_fq_ctr" = {0: obs1, 1: obs2}
"output_act_obs_or_fq_ctr" = {0: obs3, 1: obs4}
(note that this would not allow configuring obs/fq for nested structures),
or a config that mirrors the structure of the arguments and outputs, e.g. for the same out1, (out2, out3) = op(arg0, (arg1, arg2)) we can have
"input_act_obs_or_fq_ctr" = (obs1, (obs2, obs3))
"output_act_obs_or_fq_ctr" = (obs4, (obs5, obs6))
(see the sketch after this list).
(2) Use these observer/fq constructors directly when inserting observers, instead of going through qconfig.
(3) Clean up the TODOs in the code base.
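A minimal sketch of these layouts follows. It is illustrative only: the observer classes come from torch.ao.quantization.observer, but the specific constructors chosen here, the obs1..obs6 stand-ins, and the "output_act_obs_or_fq_ctr" key in the proposed variants are assumptions, not the final API.

```python
import torch
from torch.ao.quantization.observer import (
    HistogramObserver,
    MinMaxObserver,
    PerChannelMinMaxObserver,
    PlaceholderObserver,
)

# Before this PR, node.meta["target_dtype_info"] stored (dtype, is_dynamic)
# information for each role; after this PR it stores observer/fake_quant
# constructors keyed by role (hypothetical values shown here).
target_dtype_info = {
    "input_act_obs_or_fq_ctr": HistogramObserver.with_args(dtype=torch.quint8),
    "weight_obs_or_fq_ctr": PerChannelMinMaxObserver.with_args(dtype=torch.qint8),
    "bias_obs_or_fq_ctr": PlaceholderObserver.with_args(dtype=torch.float),
    "output_obs_or_fq_ctr": HistogramObserver.with_args(dtype=torch.quint8),
}

# Stand-in constructors for the follow-up (1) examples.
obs1 = obs2 = obs3 = obs4 = obs5 = obs6 = MinMaxObserver.with_args(dtype=torch.quint8)

# Follow-up (1), option A: flat index -> constructor maps for
# out1, (out2, out3) = op(arg0, (arg1, arg2)); nested args/outputs
# cannot be addressed individually this way.
flat_config = {
    "input_act_obs_or_fq_ctr": {0: obs1, 1: obs2},
    "output_act_obs_or_fq_ctr": {0: obs3, 1: obs4},
}

# Follow-up (1), option B: a config that mirrors the nesting of the op's
# arguments and outputs, so (arg1, arg2) and (out2, out3) get per-element
# constructors.
nested_config = {
    "input_act_obs_or_fq_ctr": (obs1, (obs2, obs3)),
    "output_act_obs_or_fq_ctr": (obs4, (obs5, obs6)),
}
```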
Test Plan:
python test/test_quantization.py TestQuantizeFx
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92641
Approved by: https://github.com/jcaip