onnxruntime
Implement new experimental lookup-based matrix multiplication method(TMAC)
#26695
Merged
Commits (52)
init code structure for matmul 2 bits
liqunfu
committed
1 year ago
add and pass q4dq tests for q2bit - rename file and test name later
liqunfu
committed
1 year ago
some fixes
liqunfu
committed
1 year ago
add apis to neon and other avxs
liqunfu
committed
1 year ago
fix neon build
liqunfu
committed
1 year ago
disable 2bit test
liqunfu
committed
1 year ago
2 bit quantize to support model builder
liqunfu
committed
1 year ago
Merge remote-tracking branch 'msft/main' into carzh/bitnet-reverse-last-commit
carzh
committed
270 days ago
fix compile errors
carzh
committed
268 days ago
resolve build failure update
carzh
committed
267 days ago
2 bits check
HectorSVC
committed
262 days ago
fixed bug causing int8 tests to fail
carzh
committed
260 days ago
Merge remote-tracking branch 'origin/main' into carzh/bitnet-reverse-last-commit-new
carzh
committed
248 days ago
lintrunner
carzh
committed
248 days ago
prepack wip -- not prepacking b data because dispatch to check for mlas kernel not implemented for fp32. Also, I need to write the packing logic for the scales as well.
carzh
committed
242 days ago
fixed dispatch issue, added acc level 4 tests, and now running into assert issue with the data shuffling in prepack
carzh
committed
239 days ago
deep sigh
Caroline Zhu
committed
221 days ago
builds somehow
Caroline Zhu
committed
219 days ago
update
Caroline Zhu
committed
213 days ago
udpate
Caroline Zhu
committed
208 days ago
Implement Pre Packing of qweight for tmac
vraspar
committed
192 days ago
Implement Pre packing for Scales and zero points
vraspar
committed
187 days ago
Transform zero points before interleaving
vraspar
committed
187 days ago
Initial implementation of tmac kernel config
vraspar
committed
186 days ago
Move pre packing scales and zp code to qlutgemm and use tmac_params
vraspar
committed
186 days ago
update
Caroline Zhu
committed
181 days ago
bug fixes
Caroline Zhu
committed
178 days ago
Fix bug in scale unpacking
vraspar
committed
173 days ago
Fix issues with TMAC GEMM kernels and remove hard coded variables
vraspar
committed
165 days ago
Fix bug in LUT table generation
vraspar
committed
163 days ago
Fix casting issue
vraspar
committed
152 days ago
add session option and clean up
vraspar
committed
149 days ago
Refactor QNBit GEMM Implementation for AVX2
vraspar
committed
131 days ago
Refactor dispatch
vraspar
committed
130 days ago
Add test cases
vraspar
committed
130 days ago
rewrite test_sqlutgemm.cpp
vraspar
committed
122 days ago
Add more robust checking before using LUT kernels
vraspar
committed
122 days ago
Merge remote-tracking branch 'origin/main' into vraspar/lut-gemm
vraspar
committed
117 days ago
revert graph_transform_test.cc
vraspar
committed
117 days ago
Clean up: revert unchanged files
vraspar
committed
117 days ago
Apply linting and clean up
vraspar
committed
110 days ago
Add headers, update binding, and general clean up + linting
vraspar
committed
109 days ago
Fix zero point test cases
vraspar
committed
109 days ago
Refactor ComputeBPackedLUT to remove unused parameters and simplify function signature
vraspar
committed
99 days ago
Merge remote-tracking branch 'origin/main' into vraspar/lut-gemm
vraspar
committed
99 days ago
Fix compiler warnings
vraspar
committed
99 days ago
Improve error handling in TMACComputeGemm_avx2 for batch size and scale group size validation
vraspar
committed
99 days ago
Apply feedback and use PrePacking
vraspar
committed
94 days ago
update platform.cpp
vraspar
committed
94 days ago
use MLAS_THROW_EX for qlutgemm.cpp
vraspar
committed
92 days ago
Add LUT GEMM 2-bit tests and fix Python quantization reference implementation
Vrajang Parikh
committed
89 days ago
Merge remote-tracking branch 'origin/main' into vraspar/lut-gemm
vraspar
committed
88 days ago