[Quant] Use input_qspec_map for weight quantization of linear (#107105)
Summary:
In preparation for the metadata porting diff, weight quantization
annotation must happen via edge quantization, i.e. input_qspec_map.
Reason: Metadata is ported by associating a DQ node's metadata with its
consumer and a Q node's metadata with its producer.
Furthermore, such porting must be qualified by user intent: we check
whether the consumer of DQ, or the producer of Q, actually specified an
intent to quantize.
By annotating the linear node's weight via input_qspec_map, we can
associate the DQ in [weight -> Q -> DQ] with the linear module.
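
The porting rule described above can be illustrated with a small toy
model. This is only a sketch in plain Python, not the PyTorch
implementation: `Node`, `insert_q_dq`, `port_dq_meta`, and the
`source_module` metadata key are hypothetical names made up for this
example, and the qspec is reduced to a string label. It shows that a DQ
node inherits its consumer's metadata only when the consumer declared
quantization intent for that input edge.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(eq=False)
class Node:
    """Toy graph node (hypothetical; not torch.fx.Node)."""
    name: str
    inputs: List["Node"] = field(default_factory=list)
    meta: Dict[str, object] = field(default_factory=dict)
    # Toy edge annotation: original input name -> qspec label.
    input_qspec_map: Dict[str, str] = field(default_factory=dict)


def insert_q_dq(consumer: Node, producer: Node) -> Tuple[Node, Node]:
    """Rewrite the edge producer -> consumer as producer -> Q -> DQ -> consumer."""
    q = Node(f"{producer.name}_q", [producer])
    dq = Node(f"{producer.name}_dq", [q])
    consumer.inputs = [dq if n is producer else n for n in consumer.inputs]
    return q, dq


def port_dq_meta(dq: Node, consumer: Node, original_input: str) -> bool:
    """Copy consumer metadata onto DQ only if the consumer annotated that edge."""
    if original_input not in consumer.input_qspec_map:
        return False  # no user intent recorded on this edge; do not port
    dq.meta["source_module"] = consumer.meta.get("source_module")
    return True


# Case 1: linear annotates its weight edge, so DQ is tied to the linear module.
x = Node("x")
weight = Node("weight")
linear = Node(
    "linear",
    inputs=[x, weight],
    meta={"source_module": "fc1"},
    input_qspec_map={"weight": "int8_weight_qspec"},
)
w_q, w_dq = insert_q_dq(linear, weight)
ported = port_dq_meta(w_dq, linear, "weight")

# Case 2: a consumer with no edge annotation; nothing is ported.
other_weight = Node("other_weight")
plain = Node("plain_op", inputs=[other_weight], meta={"source_module": "op2"})
o_q, o_dq = insert_q_dq(plain, other_weight)
not_ported = port_dq_meta(o_dq, plain, "other_weight")
```

In this toy model the weight node itself carries no annotation; the
intent lives on the consumer's input edge, which is what lets the DQ
lookup succeed from the consumer side.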
Test Plan:
CI
Differential Revision: [D48488414](https://our.internmc.facebook.com/intern/diff/D48488414)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107105
Approved by: https://github.com/jerryzh168