pytorch
83d848e1 - [Quant][Inductor] Enable lowering of dynamic qlinear for X86Inductor (#120605)


Commit

202 days ago

[Quant][Inductor] Enable lowering of dynamic qlinear for X86Inductor (#120605)

**description**
Enable lowering of dynamic qlinear for X86Inductor. The pattern is `choose_qparams -> getitem -> q -> dq -> linear`. We only fuse `dq -> linear` and get `choose_qparams -> getitem -> q -> onednn.qlinear_pointwise`. So, we treat it as dynamic quantization of the activation + static quantized linear.
The previous implementation of `onednn.qlinear_pointwise` only covers the case where `x_scale` and `x_zp` are scalars. Since `choose_qparams` returns tensors, we added a variation, `onednn.qlinear_pointwise.tensor`, to support this case.
This feature targets the PyTorch 2.3 release.

**Test plan**
```
python inductor/test_mkldnn_pattern_matcher.py -k test_dynamic_qlinear_cpu
python inductor/test_mkldnn_pattern_matcher.py -k test_dynamic_qlinear_qat_cpu
python inductor/test_cpu_cpp_wrapper.py -k test_dynamic_qlinear
```

**Performance before and after lowering `choose_qparams` to Inductor**
Before
- latency for shape (32, 32) = 0.151 ms
- latency for shape (128, 128) = 0.153 ms
- latency for shape (1024, 1024) = 0.247 ms

After
- latency for shape (32, 32) = 0.049 ms
- latency for shape (128, 128) = 0.052 ms
- latency for shape (1024, 1024) = 0.133 ms

Test method: A module with a single Linear layer, dynamic-quantized and lowered to X86Inductor
Test env & config: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz, single instance, single core, using Intel OpenMP and Tcmalloc

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120605
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/jerryzh168
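To make the `choose_qparams -> q -> dq -> linear` pattern from the commit message concrete, here is a minimal plain-Python sketch of the arithmetic involved. It is not the PyTorch or oneDNN implementation. The helper names, the uint8 range, and the min/max-based choice of scale and zero point are assumptions made for illustration only.

```python
# Illustrative sketch only: per-tensor dynamic quantization arithmetic in plain Python.
# The function names and the uint8 range [0, 255] are assumptions, not PyTorch APIs.

def choose_qparams(values, qmin=0, qmax=255):
    """Derive an affine (scale, zero_point) pair from the observed value range."""
    lo = min(min(values), 0.0)  # widen the range to include 0.0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:  # all-zero input: any positive scale works
        scale = 1.0
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """q = clamp(round(v / scale) + zero_point, qmin, qmax)"""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize(qvalues, scale, zero_point):
    """v ~= (q - zero_point) * scale"""
    return [(q - zero_point) * scale for q in qvalues]

def linear(x, weight, bias):
    """y[j] = sum_i x[i] * weight[j][i] + bias[j]"""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weight, bias)]

# One activation vector and a small 2x3 weight matrix (made-up values).
x = [-1.0, 0.5, 2.0]
weight = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
bias = [0.0, 1.0]

# The quantization parameters are computed from the activation at run time,
# which is what makes the scheme "dynamic".
scale, zp = choose_qparams(x)
x_q = quantize(x, scale, zp)
x_dq = dequantize(x_q, scale, zp)

y_ref = linear(x, weight, bias)   # float reference
y_q = linear(x_dq, weight, bias)  # result after the q -> dq round trip
```

In this sketch `scale` and `zp` are computed from the input on every call, rather than being constants fixed ahead of time. That loosely mirrors why the commit adds a variant of `onednn.qlinear_pointwise` that accepts tensor-valued `x_scale` and `x_zp`.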

Author

Xia-Weiwen

Committer

pytorchmergebot

Parents

af5376c4
