Changes to enable per channel requant. (#37620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37620
Per-channel quantization is now supported for linear and conv operators.
Depthwise conv support is still pending.
Tests now generate per-channel zero points and requantization
scales.
All kernels are updated accordingly.
Added a per_channel member to the conv_param structure and replicated
the conv tests to exercise per-channel conv.
This was not strictly needed, since the conv kernels were changed to
always requantize per channel. When per-channel quantization is not
needed, the same zp and scale are used for every channel. This minimizes
code duplication; the perf impact is estimated to be low, but has not
been measured yet.
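
To illustrate the "always per channel" approach described above, here is a
minimal scalar sketch in C. It is not the QNNPACK kernel code: the function
names, parameter names, the float-based requantization, and the
round-half-away-from-zero rounding are all assumptions made for the example.
Each output channel reads its own kernel zero point and requantization scale;
the per-tensor case is expressed by filling both arrays with one repeated
value, so a single code path serves both.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a QNNPACK function): scale an int32 accumulator,
 * round half away from zero, add the output zero point, clamp to uint8. */
static uint8_t requantize_u8(int32_t acc, float scale, uint8_t out_zero_point) {
  const float x = (float)acc * scale;
  int64_t q = (int64_t)(x >= 0.0f ? x + 0.5f : x - 0.5f) + out_zero_point;
  if (q < 0) q = 0;
  if (q > 255) q = 255;
  return (uint8_t)q;
}

/* Hypothetical scalar reference: one output value per channel.
 * weights holds `channels` rows of `k` values each. Channel c uses
 * kernel_zero_points[c] and requant_scales[c]. For per-tensor
 * quantization the caller fills both arrays with one repeated value. */
static void linear_per_channel_u8(
    const uint8_t* input, uint8_t input_zero_point,
    const uint8_t* weights, const uint8_t* kernel_zero_points,
    const float* requant_scales, uint8_t out_zero_point,
    size_t channels, size_t k, uint8_t* out) {
  for (size_t c = 0; c < channels; c++) {
    int32_t acc = 0;
    for (size_t i = 0; i < k; i++) {
      const int32_t a = (int32_t)input[i] - (int32_t)input_zero_point;
      const int32_t w =
          (int32_t)weights[c * k + i] - (int32_t)kernel_zero_points[c];
      acc += a * w;
    }
    out[c] = requantize_u8(acc, requant_scales[c], out_zero_point);
  }
}
```

The cost of this design is one extra load of a zero point and a scale per
output channel, even when every channel shares the same values.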
However, this is unlikely to hold for depthwise convs, so they will
have separate kernels. That required introducing the per_channel member
in the conv_param structure, to know which kernels to apply for
depthwise. The remaining modifications keep regular conv and depthwise
in sync, so that a reader of the code is not left wondering why
depthwise has a separate per-channel test and non-depthwise conv does
not.
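
As a rough sketch of how a per_channel flag could drive kernel selection
for depthwise, the C snippet below uses a hypothetical, trimmed-down struct
and enum. Only the per_channel member comes from this change; the struct
name, the depthwise field, the enum, and select_kernel are illustrative
assumptions and do not mirror the actual conv_param layout.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the conv_param structure; only per_channel
 * corresponds to a member named in this change. */
struct conv_param_sketch {
  bool depthwise;    /* assumed field, for illustration only */
  bool per_channel;  /* true when zp and scale differ across channels */
};

/* Hypothetical kernel identifiers. */
enum kernel_kind {
  KERNEL_CONV,               /* regular conv: always the per-channel path */
  KERNEL_DWCONV,             /* depthwise, per-tensor parameters */
  KERNEL_DWCONV_PER_CHANNEL  /* depthwise, per-channel parameters */
};

/* Regular conv ignores the flag because its kernels handle both cases;
 * depthwise picks a dedicated kernel based on it. */
static enum kernel_kind select_kernel(const struct conv_param_sketch* p) {
  if (!p->depthwise) {
    return KERNEL_CONV;
  }
  return p->per_channel ? KERNEL_DWCONV_PER_CHANNEL : KERNEL_DWCONV;
}
```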
Test Plan:
Via tests inside qnnpack, i.e., q8gemm-test, q8conv/dwconv test,
fully-connected-test, convolution-test.
Imported from OSS
Differential Revision: D21339041
fbshipit-source-id: 1b8fbd7fbd0fe0582a43996147171567b126d948