[x86] matmul8bit memory loading perf tuning #24732
use aligned loading. 10% token generation speed up
7f361990
use preloading for 128 blksize
1d0bb06c
liqunfu
approved these changes
on 2025-05-13
yufenglee
approved these changes
on 2025-05-15
snnn
merged
fd3e0e84
into main 360 days ago
snnn
deleted the fajin/matmul8bit_x64_perf branch 360 days ago