Enable per channel zero point. (#37619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37619
This PR adds support for per-channel zero points.
Modifies the kernels accordingly.
Also fixes some bugs found while enabling per-channel zero points.
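
For illustration only, a minimal sketch of what a per-channel zero point means for an integer GEMM accumulator. This is not the kernel code from this PR: the function name q8gemm_reference, its parameters, and the sample values are hypothetical, and it assumes one zero point per output channel (one per row of the weight matrix) in place of a single shared value.

```python
def q8gemm_reference(a_q, a_zero_point, w_q, w_zero_points):
    """Scalar reference for one row of a quantized GEMM (hypothetical helper).

    a_q           -- quantized activations (uint8 values as ints)
    a_zero_point  -- single zero point for the activations
    w_q           -- quantized weights, one row per output channel
    w_zero_points -- one zero point per output channel (assumed layout)
    """
    out = []
    for channel, w_row in enumerate(w_q):
        # Each output channel subtracts its own weight zero point.
        w_zp = w_zero_points[channel]
        acc = sum((a - a_zero_point) * (w - w_zp) for a, w in zip(a_q, w_row))
        out.append(acc)
    return out


a_q = [130, 120, 128]
w_q = [[10, 12, 8], [200, 190, 210]]

# Per-channel: channel 0 uses zero point 10, channel 1 uses zero point 200.
per_channel = q8gemm_reference(a_q, 128, w_q, [10, 200])

# Per-tensor: the same zero point (10) is applied to every channel.
per_tensor = q8gemm_reference(a_q, 128, w_q, [10, 10])
```

With these sample values, per_channel is [-16, 80] while per_tensor is [-16, -1060]: the second channel's weights sit far from the shared zero point, which is the case a per-channel zero point is meant to handle.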
Test Plan:
Via tests inside qnnpack, i.e., q8gemm-test, q8conv/dwconv test,
fully-connected-test, convolution-test.
Imported from OSS
Differential Revision: D21339044
fbshipit-source-id: fb69488b2b04da109c69f3dd1e8a285babf2863d