onnxruntime
26b39641 - Block-wise 4b quantization matmul operator change (#18172)

Commit

2 years ago

Block-wise 4b quantization matmul operator change (#18172)

### Description
Replace block-wise 4b quantization implementation

### Motivation and Context
In https://github.com/microsoft/onnxruntime/pull/18101 we have an augmented block-wise 4b quantization interface and implementation. Here we use this new implementation in onnxruntime contrib ops

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
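For readers unfamiliar with the term, the following is a minimal sketch of what block-wise 4b quantization generally means: values are split into fixed-size blocks, each block gets its own scale, and every value is stored as a 4-bit code, two codes per byte. It is not the implementation from this commit or from #18101. The function names, the block size of 32, the fixed zero point of 8, and the nibble order are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def quantize_blockwise_4b(values: Sequence[float], block_size: int = 32) -> Tuple[bytes, List[float]]:
    """Hypothetical helper: quantize values to 4-bit codes with one scale per block.

    Assumes a symmetric scheme with a fixed zero point of 8, so codes fall in 0..15.
    """
    if block_size <= 0 or block_size % 2:
        raise ValueError("block_size must be a positive even number")
    if len(values) % block_size:
        raise ValueError("len(values) must be a multiple of block_size")

    packed = bytearray()
    scales: List[float] = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        # Map the largest magnitude in the block to +/-7; use 1.0 for an all-zero block.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        codes = [min(15, max(0, round(v / scale) + 8)) for v in block]
        # Pack two 4-bit codes per byte: even index in the low nibble (an assumption).
        for low, high in zip(codes[0::2], codes[1::2]):
            packed.append(low | (high << 4))
    return bytes(packed), scales


def dequantize_blockwise_4b(packed: bytes, scales: Sequence[float], block_size: int = 32) -> List[float]:
    """Hypothetical helper: invert quantize_blockwise_4b up to rounding error."""
    bytes_per_block = block_size // 2
    restored: List[float] = []
    for i, byte in enumerate(packed):
        scale = scales[i // bytes_per_block]
        restored.append(((byte & 0x0F) - 8) * scale)
        restored.append(((byte >> 4) - 8) * scale)
    return restored
```

Under these assumptions, each block of 32 floats shrinks to 16 bytes plus one scale, and the round-trip error per value is at most half of that block's scale.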

References

#18172 - Block-wise 4b quantization matmul operator change

Author

chenfucn

Parents

2ec1f94b
