onnxruntime
02aa881a - [wasm] Optimize WASM relaxed simd MlasGemmQuantKernel (#25048)

Commit
279 days ago
[wasm] Optimize WASM relaxed simd MlasGemmQuantKernel (#25048)

### Description

This change introduced a 6x8 QGEMM micro kernel for WASM relaxed SIMD build.

### Motivation and Context

This change optimizes the performance of QGEMM on x64 devices with AVX-VNNI.

| Mlas bench/RPL laptop/node v24.1.0 | baseline | opt | diff |
|------------------------------------------------------------------------|----------|---------|------|
| QGEMM/UnsignedANoPackB/M:384/N:1024/K:1024/Batch:1/Threads:4/real_time | 2452212 | 1708338 | 44% |
| QGEMM/UnsignedANoPackB/M:384/N:1024/K:3072/Batch:1/Threads:4/real_time | 9053789 | 6395584 | 42% |
| QGEMM/UnsignedANoPackB/M:384/N:1024/K:4096/Batch:1/Threads:4/real_time | 12109727 | 8189719 | 48% |
| QGEMM/UnsignedANoPackB/M:384/N:4096/K:1024/Batch:1/Threads:4/real_time | 11787607 | 7926226 | 49% |
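For readers unfamiliar with the term, a "6x8 micro kernel" computes one 6-row by 8-column tile of the output matrix per call. The following is only a scalar reference sketch of that tile shape, not the code from this commit: the function name `QGemmTile6x8Reference`, the constants `kTileM`/`kTileN`, the signed 8-bit type for B, and the omission of zero-point handling and packing are all assumptions made for illustration. The actual kernel uses WASM relaxed SIMD instructions, which are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tile shape, taken from the "6x8" in the commit description.
constexpr std::size_t kTileM = 6;
constexpr std::size_t kTileN = 8;

// Scalar reference for one 6x8 output tile: C[6x8] = A[6xK] * B[Kx8].
// Assumptions: A is unsigned 8-bit, B is signed 8-bit, C accumulates in int32,
// all matrices are row-major, and lda/ldb/ldc are row strides in elements.
// Zero points and B packing are deliberately left out of this sketch.
inline void QGemmTile6x8Reference(const std::uint8_t* A, std::size_t lda,
                                  const std::int8_t* B, std::size_t ldb,
                                  std::int32_t* C, std::size_t ldc,
                                  std::size_t K) {
    assert(A != nullptr && B != nullptr && C != nullptr);

    // Keep the whole tile in local accumulators; a SIMD kernel would hold
    // these in vector registers for the duration of the K loop.
    std::int32_t acc[kTileM][kTileN] = {};

    for (std::size_t k = 0; k < K; ++k) {
        for (std::size_t m = 0; m < kTileM; ++m) {
            const std::int32_t a = A[m * lda + k];
            for (std::size_t n = 0; n < kTileN; ++n) {
                acc[m][n] += a * static_cast<std::int32_t>(B[k * ldb + n]);
            }
        }
    }

    // Write the finished tile back to C.
    for (std::size_t m = 0; m < kTileM; ++m) {
        for (std::size_t n = 0; n < kTileN; ++n) {
            C[m * ldc + n] = acc[m][n];
        }
    }
}
```

A larger tile reuses each loaded value of A and B across more multiply-accumulates, which is the usual reason for widening a micro kernel; whether that is the main source of the gains in the table above is not stated in the commit message.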