/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline
@yihonglyu Can you please rerun the Azure Pipeline tests?
Also, the lint failures are spellcheck warnings from code outside my commits. Should I do anything to fix those?
@yihonglyu Could you please rerun the Azure Pipeline tests?
Description
Implemented sign flipping in QGemm CopyPackA so that the AVX2 and AVX-VNNI kernels can handle S8S8 and S8U8 inputs.
Added dispatching for the S8S8 and S8U8 variants, falling back to the C++ implementation if AVX2 is not present.
Added unit testing triggers for S8S8 and S8U8.
Motivation and Context
The QGemm kernel expects data in U8S8 form so that it can use the AVX-VNNI dot product instructions and gain the corresponding performance benefits.
Existing code can sign-flip the B matrix from unsigned to signed to allow U8U8 data to use this U8S8 VNNI instruction.
This code enables sign flipping in the A matrix to also allow S8S8 and S8U8 models to be translated into U8S8 form and use the VNNI instructions.
With this change, onnxruntime can handle models in any int8 data format (U8U8, U8S8, S8S8, or S8U8) and give them the same performance benefits.
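As a rough illustration of why flipping the sign of A loses no information, the sketch below is not the code in this PR: the helper names `FlipSign`, `DotS8S8`, and `DotViaU8S8` are hypothetical. Flipping the top bit of an int8 value is the same as adding 128, so a U8S8 dot product over the flipped A overshoots by exactly 128 times the sum of the B values, which can be subtracted afterwards. The real kernel may compensate differently, for example by adjusting the zero point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: map a signed 8-bit value to unsigned by flipping the
// sign bit. Numerically this equals v + 128, so [-128, 127] maps to [0, 255].
inline uint8_t FlipSign(int8_t v) {
    return static_cast<uint8_t>(static_cast<uint8_t>(v) ^ 0x80u);
}

// Reference result: plain S8S8 dot product accumulated in int32.
int32_t DotS8S8(const std::vector<int8_t>& a, const std::vector<int8_t>& b) {
    assert(a.size() == b.size());
    int32_t acc = 0;
    for (size_t k = 0; k < a.size(); ++k) {
        acc += static_cast<int32_t>(a[k]) * static_cast<int32_t>(b[k]);
    }
    return acc;
}

// Same result computed in U8S8 form: A is sign-flipped to unsigned, then
// the surplus 128 * sum(b) introduced by the flip is removed.
// (Assumed compensation scheme, shown only to demonstrate the identity.)
int32_t DotViaU8S8(const std::vector<int8_t>& a, const std::vector<int8_t>& b) {
    assert(a.size() == b.size());
    int32_t acc = 0;
    int32_t b_sum = 0;
    for (size_t k = 0; k < a.size(); ++k) {
        const uint8_t a_u8 = FlipSign(a[k]);
        acc += static_cast<int32_t>(a_u8) * static_cast<int32_t>(b[k]);
        b_sum += static_cast<int32_t>(b[k]);
    }
    return acc - 128 * b_sum;
}
```

The same identity applies when B is flipped from unsigned to signed for the existing U8U8 path, with the roles of A and B swapped.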