onnxruntime
26b39641 - Block-wise 4b quantization matmul operator change (#18172)

Commit
2 years ago
Block-wise 4b quantization matmul operator change (#18172)

### Description
Replace block-wise 4b quantization implementation

### Motivation and Context
In https://github.com/microsoft/onnxruntime/pull/18101 we have an augmented block-wise 4b quantization interface and implementation. Here we use this new implementation in onnxruntime contrib ops

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>