onnxruntime
Enabling S8S8 and S8U8 handling in QGemm for AVX2 and AVX-VNNI
#21123
Closed

Enabling S8S8 and S8U8 handling in QGemm for AVX2 and AVX-VNNI #21123

mguynn-intc wants to merge 4 commits into microsoft:main from mguynn-intc:avx-vnni-int8
mguynn-intc
mguynn-intc313 days ago

Description

Implementation of sign flipping in QGemm CopyPackA to enable S8S8 and S8U8 handling in AVX2 and AVX-VNNI.
Added dispatching for S8S8 and S8U8 variants, defaulting to C++ implementation if AVX2 is not present.
Added unit testing triggers for S8S8 and S8U8.

Motivation and Context

QGemm kernel expects data in U8S8 form to utilize AVX-VNNI dot product instructions and the corresponding performance benefits.
Existing code can sign-flip the B matrix from unsigned to signed to allow U8U8 data to use this U8S8 VNNI instruction.
This code enables sign flipping in the A matrix to also allow S8S8 and S8U8 models to be translated into U8S8 form and use the VNNI instructions.
This change will enable models of any int8 data format to be handled by onnxruntime and see the same performance benefits.

mguynn-intc mguynn-intc requested a review 313 days ago
yihonglyu
yihonglyu commented on 2024-06-26
Conversation is marked as resolved
Show resolved
onnxruntime/test/mlas/unittest/test_qgemm.cpp
95 count += QgemmShortExecuteTest<int8_t, uint8_t, int32_t, true, false>::RegisterShortExecuteTests();
96 }
97 } catch (const std::invalid_argument& e) {
98
(void)e;
yihonglyu307 days ago

Is the exception introduced by your PR?

mguynn-intc307 days ago

Not directly.
The function MlasGemmQuantGetDispatch throws the error if the platform does not support the target signed/unsigned combination. Original code seems to fallback to C++ implementation by default for U8X8 or X8S8. When I added the new dispatchers I had originally set them to nullptr, then had to add these try/catch blocks when testing different platforms that couldn't support S8S8/S8U8.
I was unsure whether Microsoft wants all implementation routes to fallback to default C++ implementation if no other option is available. At some point I changed my dispatchers to be initialized to the default implementation so there would always be something functional. I believe with this initialization, it is unnecessary to have the try/catch blocks as it will never be null, but I wanted to get the intention clarified.

After reviewing some of the automated build results, I see that exceptions are not desired/enabled for certain platforms, so I will leave the dispatchers initialized to the default C++ method and remove the exception handling.

yihonglyu
yihonglyu307 days ago

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline

azure-pipelines
azure-pipelines307 days ago
Azure Pipelines successfully started running 9 pipeline(s).
mguynn-intc Implementation of sign flipping in QGemm CopyPackA to enable S8S8 and…
72e49b72
mguynn-intc Removed exception handling in unit tests
a032de7d
mguynn-intc Fixed typo in unit test
40e37838
mguynn-intc mguynn-intc force pushed from e588e4e7 to 40e37838 303 days ago
mguynn-intc Moved S8S8 and S8U8 support to AVX-VNNI only. AVX2 will keep default …
ae16d591
mguynn-intc mguynn-intc requested a review from yihonglyu yihonglyu 295 days ago
mguynn-intc
mguynn-intc288 days ago

@yihonglyu Can you please rerun the Azure Pipeline tests?

Also, the lint failures are spellcheck warnings from code outside my commits. Should I do anything to fix those?

georgen117
georgen117285 days ago

@yihonglyu Could you please rerun the Azure Pipeline tests?

mguynn-intc mguynn-intc closed this 272 days ago

Login to write a write a comment.

Login via GitHub

Reviewers
Assignees
No one assigned
Labels
Milestone