onnxruntime
9a993c37 - [CPU] Add 8bit support to matmulnbits quantizer (#24384)

Commit

287 days ago

[CPU] Add 8bit support to matmulnbits quantizer (#24384)

### Description

Add 8-bit support to the MatMulNBits quantizer. matmul_4bits_quantizer can now quantize a constant B input of a MatMul into an 8-bit initializer.

### Motivation and Context

MatMul4Bits has an accuracy issue with the phi-4 model used for Foundry Local. An early prototype showed that using >= 6 bits fixes the issue. To mitigate the issue as soon as possible, add 8-bit support to MatMulNBits.
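To illustrate why a wider bit width can help accuracy, here is a minimal pure-Python sketch of block-wise asymmetric weight quantization. It is not the onnxruntime implementation: the function names (`quantize_block`, `dequantize_block`, `max_roundtrip_error`), the block size of 32, and the sample weights are all assumptions made for illustration only.

```python
import math
from typing import List, Tuple


def quantize_block(values: List[float], bits: int = 8) -> Tuple[List[int], float, int]:
    """Hypothetical helper: asymmetric uniform quantization of one block
    to unsigned integers in [0, 2**bits - 1]."""
    qmax = (1 << bits) - 1
    lo = min(min(values), 0.0)  # extend the range to include zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = max(0, min(qmax, round(-lo / scale)))
    # Round each value to the nearest grid point, then clamp to the valid range.
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_block(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map quantized integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


def max_roundtrip_error(values: List[float], bits: int, block_size: int = 32) -> float:
    """Largest absolute quantize/dequantize error over all blocks."""
    worst = 0.0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        q, scale, zp = quantize_block(block, bits)
        restored = dequantize_block(q, scale, zp)
        worst = max(worst, max(abs(a - b) for a, b in zip(block, restored)))
    return worst


# Made-up sample weights, standing in for one row of a constant B matrix.
weights = [0.5 * math.sin(0.37 * i) for i in range(64)]
err4 = max_roundtrip_error(weights, bits=4)
err8 = max_roundtrip_error(weights, bits=8)
print(f"4-bit max error: {err4:.5f}, 8-bit max error: {err8:.5f}")
```

With the same block size, the 8-bit grid has 256 levels per block instead of 16, so each weight is rounded to a much finer step. This is the general effect the commit relies on, under the assumptions stated above.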

References

#24384 - [CPU] Add 8bit support to matmulnbits quantizer

Author

fajin-corp

Parents

90c263f4
