onnxruntime
Implement new experimental lookup-based matrix multiplication method (TMAC)
#26695
Merged

Commits
  • init code structure for matmul 2 bits
    liqunfu committed 1 year ago
  • add and pass q4dq tests for q2bit - rename file and test name later
    liqunfu committed 1 year ago
  • some fixes
    liqunfu committed 1 year ago
  • add apis to neon and other avxs
    liqunfu committed 1 year ago
  • fix neon build
    liqunfu committed 1 year ago
  • disable 2bit test
    liqunfu committed 1 year ago
  • 2 bit quantize to support model builder
    liqunfu committed 1 year ago
  • Merge remote-tracking branch 'msft/main' into carzh/bitnet-reverse-last-commit
    carzh committed 270 days ago
  • fix compile errors
    carzh committed 268 days ago
  • resolve build failure update
    carzh committed 267 days ago
  • 2 bits check
    HectorSVC committed 262 days ago
  • fixed bug causing int8 tests to fail
    carzh committed 260 days ago
  • Merge remote-tracking branch 'origin/main' into carzh/bitnet-reverse-last-commit-new
    carzh committed 248 days ago
  • lintrunner
    carzh committed 248 days ago
  • prepack wip -- not prepacking b data because dispatch to check for mlas kernel not implemented for fp32. Also, I need to write the packing logic for the scales as well.
    carzh committed 242 days ago
  • fixed dispatch issue, added acc level 4 tests, and now running into assert issue with the data shuffling in prepack
    carzh committed 239 days ago
  • deep sigh
    Caroline Zhu committed 221 days ago
  • builds somehow
    Caroline Zhu committed 219 days ago
  • update
    Caroline Zhu committed 213 days ago
  • update
    Caroline Zhu committed 208 days ago
  • Implement Pre Packing of qweight for tmac
    vraspar committed 192 days ago
  • Implement Pre packing for Scales and zero points
    vraspar committed 187 days ago
  • Transform zero points before interleaving
    vraspar committed 187 days ago
  • Initial implementation of tmac kernel config
    vraspar committed 186 days ago
  • Move pre packing scales and zp code to qlutgemm and use tmac_params
    vraspar committed 186 days ago
  • update
    Caroline Zhu committed 181 days ago
  • bug fixes
    Caroline Zhu committed 178 days ago
  • Fix bug in scale unpacking
    vraspar committed 173 days ago
  • Fix issues with TMAC GEMM kernels and remove hard coded variables
    vraspar committed 165 days ago
  • Fix bug in LUT table generation
    vraspar committed 163 days ago
  • Fix casting issue
    vraspar committed 152 days ago
  • add session option and clean up
    vraspar committed 149 days ago
  • Refactor QNBit GEMM Implementation for AVX2
    vraspar committed 131 days ago
  • Refactor dispatch
    vraspar committed 130 days ago
  • Add test cases
    vraspar committed 130 days ago
  • rewrite test_sqlutgemm.cpp
    vraspar committed 122 days ago
  • Add more robust checking before using LUT kernels
    vraspar committed 122 days ago
  • Merge remote-tracking branch 'origin/main' into vraspar/lut-gemm
    vraspar committed 117 days ago
  • revert graph_transform_test.cc
    vraspar committed 117 days ago
  • Clean up: revert unchanged files
    vraspar committed 117 days ago
  • Apply linting and clean up
    vraspar committed 110 days ago
  • Add headers, update binding, and general clean up + linting
    vraspar committed 109 days ago
  • Fix zero point test cases
    vraspar committed 109 days ago
  • Refactor ComputeBPackedLUT to remove unused parameters and simplify function signature
    vraspar committed 99 days ago
  • Merge remote-tracking branch 'origin/main' into vraspar/lut-gemm
    vraspar committed 99 days ago
  • Fix compiler warnings
    vraspar committed 99 days ago
  • Improve error handling in TMACComputeGemm_avx2 for batch size and scale group size validation
    vraspar committed 99 days ago
  • Apply feedback and use PrePacking
    vraspar committed 94 days ago
  • update platform.cpp
    vraspar committed 94 days ago
  • use MLAS_THROW_EX for qlutgemm.cpp
    vraspar committed 92 days ago
  • Add LUT GEMM 2-bit tests and fix Python quantization reference implementation
    Vrajang Parikh committed 89 days ago
  • Merge remote-tracking branch 'origin/main' into vraspar/lut-gemm
    vraspar committed 88 days ago