onnxruntime
9a993c37 - [CPU] Add 8bit support to matmulnbits quantizer (#24384)

Commit
251 days ago
### Description

Add 8-bit support to the MatMulNBits quantizer. matmul_4bits_quantizer can now quantize a constant B input of a MatMul into an 8-bit initializer.

### Motivation and Context

MatMul4Bits has an accuracy issue with the phi-4 model used for Foundry Local. An early prototype showed that using >= 6 bits fixes the issue. To mitigate the issue as soon as possible, add 8-bit support to MatMulNBits.
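To illustrate why a wider bit width can help accuracy, here is a minimal sketch of blockwise asymmetric quantization in plain Python. It is not the ONNX Runtime implementation: the helper names (`quantize_block`, `dequantize_block`, `max_abs_error`), the block size of 32, and the random test weights are all assumptions made for illustration.

```python
import random


def quantize_block(values, bits):
    """Asymmetric uniform quantization of one block to `bits` bits (illustrative)."""
    qmax = (1 << bits) - 1
    # Extend the range to include zero so that 0.0 stays representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = max(0, min(qmax, round(-lo / scale)))
    # Map each value to an integer code and clamp it to [0, qmax].
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_block(q, scale, zero_point):
    """Recover approximate float values from integer codes."""
    return [(x - zero_point) * scale for x in q]


def max_abs_error(weights, bits, block_size=32):
    """Largest round-trip error over all blocks (block_size is an arbitrary choice)."""
    err = 0.0
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        q, scale, zp = quantize_block(block, bits)
        deq = dequantize_block(q, scale, zp)
        err = max(err, max(abs(a - b) for a, b in zip(block, deq)))
    return err


# Hypothetical weights standing in for one row of a constant B matrix.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(256)]
err4 = max_abs_error(weights, 4)
err8 = max_abs_error(weights, 8)
print(f"4-bit max abs error: {err4:.5f}")
print(f"8-bit max abs error: {err8:.5f}")
```

With 8 bits each block has 256 quantization levels instead of 16, so the step size (`scale`) is roughly 17 times smaller and the round-trip error shrinks accordingly. This is only a toy measure of weight error and says nothing on its own about end-to-end model accuracy.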