[QNN EP] Initial INT4 support #21171
Update include/framework/ with int4
40da6791
Update onnxruntime_c_api.h with int4 type
e3e8a6bd
Update cpu_contrib_kernels.cc with int4 Q/DQ
5e01e0f6
Update framework/data_types.cc with int4 types
44e0e023
Update onnxruntime map type info with int4 types
ce03eb22
Update Tensor methods to calc int4 tensor data size
e1590078
Update function to map tensor_proto int4 to onnxruntime enum
46c3d0d6
Update tensorprotoutils to handle int4 protobufs
583dae1e
Add functions to map Int4x2 to an onnxruntime tensor type enum
0009a47c
Update com.microsoft.DequantizeLinear schema to support int4 types fo…
d11a3d49
Add option to disable int4 type in Conv and MatMul qdq node group sel…
208c4037
Add DequantizeLinear with int4 support (missing block quant)
f91ae697
update transpose helper to support int4
7323793a
Update provider bridge with int4 apis
8c79905a
Update quantizer tool with int4
fc695caa
Remove duplicate enum
eeacb78b
Remove MatMulSelector constructor arg
6f9da045
Remove unnecessary explicit template instantiation
7e8c4588
Add static_cast
3b7ed5f3
Add temporary CPU EP Int4 test (qdq conv)
c7086a5e
Update operator docs
34dfa171
Update testing version of tensorprotoutils with int4 helpers
e7bec9cf
Run lintrunner
cd8912e0
Fix api to create int4 ort value
4cf3a751
Wrap long lines in tensorprotoutils
ca785c2a
Add operator unit tests for Dequant int4/uint4
f87e785c
Remove comments
d028f2f7
Add QuantizeLinear int4 impl
24cc6172
Update operator docs
f35b09ef
Disable potentially bugged onnx tests
e33f198e
Add TODO username
10f28aa7
Fix warning as error and clean up
de1ded4c
Merge branch 'main' into adrianl/dq-transpose-int4
378718ea
Mlas kernels to quantize int4 (not blocked). Missing powerpc
746312b8
branchless update of 4-bit element
2935f797
more branchless update of int4 lane
807537cd
Fix cast warning as error
a36a128d
Remove decrement of N
bc445571
Clean up Int4x2 class
f40992dc
Github linter fixes
d0e17e21
Remove temporary unittest
6568d48a
Case statement missing :
b8d5869f
Merge branch 'adrianl/dq-transpose-int4' into adrianl/qnn-int4
80dadc5c
Ported over INT4 work for qnn ep
b01f52c4
Add debug API to qnn's get_qnn_qdq_config() for int4
35d3fa5e
lintrunner
a6a33976
Fix use of int4 qdq options
1a528f11
Merge main and fix conflicts
58044f93
Remove unnecessary changes
168f4e04
Workaround that dels calibrator.model
7ede88b9
Merge branch 'main' into adrianl/qnn-int4
ca6e5f92
Add Conv int4 per-channel test
4b2af75e
Merge main
578d47da
Clean up
9770a97b
Merge branch 'main' into adrianl/qnn-int4
eb9687a4
Run lintrunner
92efb6d5
Add more comments
05fee606
Merge branch 'main' into adrianl/qnn-int4
def223dd
Fix line lengths and TODO linter warnings
87d84399
Merge branch 'main' into adrianl/qnn-int4
618bdf96
Do not run QNN Int4 model test on windows x64
2d163a4e
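For readers unfamiliar with packed 4-bit storage, the following is a minimal sketch of the idea behind commits such as "Update Tensor methods to calc int4 tensor data size" and "branchless update of 4-bit element". It is not the PR's actual `Int4x2` class: the names `PackedInt4Pair` and `PackedByteCount` are hypothetical, and the layout (element 0 in the low nibble, element 1 in the high nibble) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only; not the Int4x2 type introduced by this PR.
// Two signed 4-bit values share one byte. Assumed layout: element 0 in the
// low nibble, element 1 in the high nibble.
struct PackedInt4Pair {
  uint8_t bits = 0;

  // Read lane 0 or 1 and sign-extend the 4-bit value to int8_t.
  int8_t Get(size_t lane) const {
    assert(lane < 2);
    const uint8_t nibble = static_cast<uint8_t>((bits >> (lane * 4)) & 0xF);
    // (nibble ^ 8) - 8 maps 0..7 to 0..7 and 8..15 to -8..-1.
    return static_cast<int8_t>(static_cast<int>(nibble ^ 0x8) - 8);
  }

  // Branchless write: the shift and the keep-mask are derived from the lane
  // index, so no if/else on the lane is needed.
  void Set(size_t lane, int8_t value) {
    assert(lane < 2);
    const unsigned shift = static_cast<unsigned>(lane) * 4;
    const uint8_t keep_mask = static_cast<uint8_t>(0xF0u >> shift);
    bits = static_cast<uint8_t>((bits & keep_mask) |
                                ((static_cast<unsigned>(value) & 0xFu) << shift));
  }
};

// Bytes needed to store num_elements 4-bit values, two per byte, rounded up.
inline size_t PackedByteCount(size_t num_elements) {
  return (num_elements + 1) / 2;
}
```

Under these assumptions, a tensor with an odd element count leaves the high nibble of its last byte unused, which is why the byte count rounds up.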
HectorSVC approved these changes on 2024-07-09