onnxruntime
[CPU EP] Refactor MatMulNBits to decouple type implementation
#22140
Merged

fajin-corp merged 6 commits into main from fajin/nbmmfp16cvt
fajin-corp refactored MatMulNBits compute to separate implementation for differe…
390678d1
fajin-corp move compute type to class fields
2c53218f
fajin-corp add specialization for repack scale
e3a5b92d
fajin-corp fix build
b37cd5ce
fajin-corp fix ut
c08769aa
fajin-corp fix linux build
b5799bf6
yufenglee commented on 2024-09-19
yufenglee approved these changes on 2024-09-20
fajin-corp merged b0ef1f39 into main 1 year ago
fajin-corp deleted the fajin/nbmmfp16cvt branch 1 year ago
