onnxruntime
[MLAS] add q4 quantize and transpose kernel to support MatMulNBits QDQ fuse
#21054
Merged

fajin-corp merged 25 commits into main from fajin/qdqmatmulnbitskkernels
fajin-corp requested a review 1 year ago
fajin-corp force pushed from 14fcdb77 to ed9421a8 1 year ago
yufenglee commented on 2024-06-18
yufenglee commented on 2024-06-18
yufenglee commented on 2024-06-18
yufenglee commented on 2024-06-18
yufenglee commented on 2024-06-18
github-advanced-security commented on 2024-06-19
github-advanced-security commented on 2024-06-19
yufenglee dismissed these changes on 2024-06-19
fajin-corp adding qdq quantize for matmulnbits
719f8a77
fajin-corp added quantizeColumnWise
17202c9f
fajin-corp updating
5cfd04f1
fajin-corp added aligning limit and updated quantizeColumnWise
142fa90d
fajin-corp finished transpose
69a3391e
fajin-corp added headers and py binding
95aae782
fajin-corp fix build error
e3a850c3
fajin-corp adding unaligned code
531f20b7
fajin-corp limit to 4 bits, and separate out pack aligned and unaligned
5bc92eac
fajin-corp refactored QuantizeColumnWisePackAligned
2d17f5cd
fajin-corp finished quantize pack unaligned
6d8c90ce
fajin-corp updated TransposeColumnWiseQuantizedPackAligned
8d8244c6
fajin-corp finished TransposeColumnWiseQuantizedPackUnaligned
6fa950de
fajin-corp fixed one opNotLastAxis
b9904a16
fajin-corp fixed blocked Q 4bit multithread bug
31852ab0
fajin-corp fix build
080e41e1
fajin-corp fix build
181723d7
fajin-corp pass ut
1138dd61
fajin-corp finished benchmarking
9acdede3
fajin-corp added odd N to benchmark
87ba147a
fajin-corp fix ci build
d1dac08d
fajin-corp update benchmark to fix linux build
ab48c1eb
fajin-corp fix ci lint error
e0ba0695
fajin-corp resolve comments
c3f3b5e9
fajin-corp fix ci warning
44c0115e
fajin-corp dismissed their stale review via 44c0115e 1 year ago
fajin-corp force pushed from a4fe4c50 to 44c0115e 1 year ago
yufenglee
yufenglee approved these changes on 2024-06-19
fajin-corp merged 6817b013 into main 1 year ago
fajin-corp deleted the fajin/qdqmatmulnbitskkernels branch 1 year ago

Assignees
No one assigned