onnxruntime
[QNN EP] Initial INT4 support
#21171
Merged


adrianlizarraga merged 61 commits into main from adrianl/qnn-int4
adrianlizarraga
adrianlizarraga Update include/framework/ with int4
40da6791
adrianlizarraga Update onnxruntime_c_api.h with int4 type
e3e8a6bd
adrianlizarraga Update cpu_contrib_kernels.cc with int4 Q/DQ
5e01e0f6
adrianlizarraga Update framework/data_types.cc with int4 types
44e0e023
adrianlizarraga Update onnxruntime map type info with int4 types
ce03eb22
adrianlizarraga Update Tensor methods to calc int4 tensor data size
e1590078
adrianlizarraga Update function to map tensor_proto int4 to onnxruntime enum
46c3d0d6
adrianlizarraga Update tensorprotoutils to handle int4 protobufs
583dae1e
adrianlizarraga Add functions to map Int4x2 to an onnxruntime tensor type enum
0009a47c
adrianlizarraga Update com.microsoft.DequantizeLinear schema to support int4 types fo…
d11a3d49
adrianlizarraga Add option to disable int4 type in Conv and MatMul qdq node group sel…
208c4037
adrianlizarraga Add DequantizeLinear with int4 support (missing block quant)
f91ae697
adrianlizarraga update transpose helper to support int4
7323793a
adrianlizarraga Update provider bridge with int4 apis
8c79905a
adrianlizarraga Update quantizer tool with int4
fc695caa
adrianlizarraga Remove duplicate enum
eeacb78b
adrianlizarraga Remove MatMulSelector constructor arg
6f9da045
adrianlizarraga Remove unnecessary explicit template instantiation
7e8c4588
adrianlizarraga Add static_cast
3b7ed5f3
adrianlizarraga Add temporary CPU EP Int4 test (qdq conv)
c7086a5e
adrianlizarraga Update operator docs
34dfa171
adrianlizarraga Update testing version of tensorprotoutils with int4 helpers
e7bec9cf
adrianlizarraga Run lintrunner
cd8912e0
adrianlizarraga Fix api to create int4 ort value
4cf3a751
adrianlizarraga Wrap long lines in tensorprotoutils
ca785c2a
adrianlizarraga Add operator unit tests for Dequant int4/uint4
f87e785c
adrianlizarraga Remove comments
d028f2f7
adrianlizarraga Add QuantizeLinear int4 impl
24cc6172
adrianlizarraga Update operator docs
f35b09ef
adrianlizarraga Disable potentially bugged onnx tests
e33f198e
adrianlizarraga Add TODO username
10f28aa7
adrianlizarraga Fix warning as error and clean up
de1ded4c
adrianlizarraga Merge branch 'main' into adrianl/dq-transpose-int4
378718ea
adrianlizarraga Mlas kernels to quantize int4 (not blocked). Missing powerpc
746312b8
adrianlizarraga branchless update of 4-bit element
2935f797
adrianlizarraga more branchless update of int4 lane
807537cd
adrianlizarraga Fix cast warning as error
a36a128d
adrianlizarraga Remove decrement of N
bc445571
adrianlizarraga Clean up Int4x2 class
f40992dc
adrianlizarraga Github linter fixes
d0e17e21
adrianlizarraga Remove temporary unittest
6568d48a
adrianlizarraga Case statement missing :
b8d5869f
adrianlizarraga Merge branch 'adrianl/dq-transpose-int4' into adrianl/qnn-int4
80dadc5c
adrianlizarraga Ported over INT4 work for qnn ep
b01f52c4
adrianlizarraga Add debug API to qnn's get_qnn_qdq_config() for int4
35d3fa5e
adrianlizarraga lintrunner
a6a33976
adrianlizarraga Fix use of int4 qdq options
1a528f11
adrianlizarraga Merge main and fix conflicts
58044f93
adrianlizarraga Remove unnecessary changes
168f4e04
adrianlizarraga Workaround that dels calibrator.model
7ede88b9
adrianlizarraga Merge branch 'main' into adrianl/qnn-int4
ca6e5f92
adrianlizarraga Add Conv int4 per-channel test
4b2af75e
adrianlizarraga Merge main
578d47da
adrianlizarraga Clean up
9770a97b
adrianlizarraga Merge branch 'main' into adrianl/qnn-int4
eb9687a4
adrianlizarraga Run lintrunner
92efb6d5
adrianlizarraga Add more comments
05fee606
adrianlizarraga Merge branch 'main' into adrianl/qnn-int4
def223dd
adrianlizarraga Fix line lengths and TODO linter warnings
87d84399
adrianlizarraga added ep:QNN
adrianlizarraga marked this pull request as ready for review 1 year ago
adrianlizarraga requested a review from HectorSVC 1 year ago
adrianlizarraga requested a review from jywu-mysoft 1 year ago
adrianlizarraga Merge branch 'main' into adrianl/qnn-int4
618bdf96
adrianlizarraga Do not run QNN Int4 model test on windows x64
2d163a4e
HectorSVC approved these changes on 2024-07-09
adrianlizarraga merged 5753f8da into main 1 year ago
adrianlizarraga deleted the adrianl/qnn-int4 branch 1 year ago
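
Several commits above refer to an Int4x2 type, to computing the data size of int4 tensors, and to a "branchless update of 4-bit element". The following is a minimal sketch of that general idea in Python, not the code from this PR: two 4-bit values share one byte, so n elements need (n + 1) // 2 bytes, and one lane can be overwritten with a shift and a mask instead of an if/else. The function names are hypothetical, and the choice to store the first element in the low nibble is an assumption made for the example.

```python
def packed_size(num_elems: int) -> int:
    """Bytes needed when two 4-bit elements share one byte."""
    return (num_elems + 1) // 2


def set_nibble(packed: int, index: int, value: int) -> int:
    """Return byte `packed` with lane `index` (0 = low, 1 = high) set to `value`.

    No branch on `index`: the lane is selected by a computed shift.
    """
    shift = (index & 1) * 4            # 0 for the low nibble, 4 for the high one
    mask = 0x0F << shift               # bits that belong to the selected lane
    return (packed & ~mask & 0xFF) | ((value & 0x0F) << shift)


def get_nibble(packed: int, index: int, signed: bool = True) -> int:
    """Read lane `index` from byte `packed`, sign-extending if `signed`."""
    raw = (packed >> ((index & 1) * 4)) & 0x0F
    return (raw ^ 0x8) - 0x8 if signed else raw


def pack_int4(values):
    """Pack a list of ints in [-8, 7] into bytes (assumed: low nibble first)."""
    out = bytearray(packed_size(len(values)))
    for i, v in enumerate(values):
        out[i // 2] = set_nibble(out[i // 2], i, v)
    return bytes(out)


def unpack_int4(data: bytes, num_elems: int):
    """Inverse of pack_int4 for the first `num_elems` elements."""
    return [get_nibble(data[i // 2], i) for i in range(num_elems)]
```

With an odd element count, the unused high nibble of the last byte is left as zero in this sketch; how the PR handles padding is not stated on this page.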
