onnxruntime
[x86] matmul8bit memory loading perf tuning
#24732
Merged

snnn merged 2 commits into main from fajin/matmul8bit_x64_perf
fajin-corp use aligned loading. 10% token generation speed up
7f361990
fajin-corp use preloading for 128 blksize
1d0bb06c
fajin-corp requested a review 363 days ago
liqunfu approved these changes on 2025-05-13
hanbitmyths approved these changes on 2025-05-15
yufenglee approved these changes on 2025-05-15
snnn merged fd3e0e84 into main 360 days ago
snnn deleted the fajin/matmul8bit_x64_perf branch 360 days ago
