llama.cpp
ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm)
#18860
Merged

Alcpz requested a review from ggerganov 14 days ago
github-actions added the ggml label
Alcpz force pushed from b517a1d4 to f4a7a91d 13 days ago
Alcpz Boilerplate for q5_Kx8 REPACK on ARM and fallback
9b2129b0
Alcpz Implements make_block_q5_Kx8 by extending make_block_q4_Kx8
7d944e99
Alcpz q5_K repack gemm and gemv generics
5ea06c3a
Alcpz Gemm and Gemv ARM implementations (i8mm)
f5341c60
Alcpz Improved qh manipulation looking at non-repack vec_dot implementation
a8e2fdbd
Alcpz Full unroll
960689d2
Alcpz Apply Q5_K Gemv vand and vshl optimizations to gemm. Improve comments.
1d8c0bd8
Alcpz Fix wrong fallback definitions of Q5_K
f9582a66
Alcpz Fixed comments. Reverted unnecessary formatting
794e9ecd
Alcpz Fixed typo in generic definitions
d65e2eae
Alcpz Switching AND + Shift with Shift Insert. Better op interleaving.
a6e22819
Alcpz Vectorize + unroll the block scales
339734dc
Alcpz Apply gemm optimizations to gemv
365555de
Alcpz Improve bias calculation
69b24778
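Note on the "AND + Shift" to "Shift Insert" commit above: Q5_K stores each 5-bit quant as a 4-bit low nibble plus one high bit kept in a separate `qh` array, so the kernels have to merge the two fields before the dot product. The sketch below is a scalar model only, not the PR's NEON code. The helper names `q5_and_shift`, `q5_shift_insert`, `sli_u8` and `q5_paths_agree` are hypothetical, and the bit positions are chosen for illustration rather than taken from the repacked layout. It shows that a per-lane shift-left-and-insert (the semantics of the NEON SLI instruction) can merge the fields in one step in place of a separate shift and OR.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: rebuild a 5-bit quant from a 4-bit low nibble and one
// high bit taken from qh, using the AND + shift + OR sequence.
static inline uint8_t q5_and_shift(uint8_t lo_nibble, uint8_t qh, int bit) {
    uint8_t hi = (uint8_t)((qh >> bit) & 1);            // isolate the high bit
    return (uint8_t)((lo_nibble & 0x0F) | (hi << 4));   // place it at bit 4
}

// Scalar model of a per-lane shift-left-and-insert (NEON SLI semantics):
// keep the low n bits of dst and overwrite the remaining bits with src << n.
static inline uint8_t sli_u8(uint8_t dst, uint8_t src, int n) {
    uint8_t keep = (uint8_t)((1u << n) - 1u);
    return (uint8_t)((dst & keep) | (uint8_t)(src << n));
}

// Same reconstruction expressed with shift-insert: the merge of the nibble
// and the high bit becomes a single operation.
static inline uint8_t q5_shift_insert(uint8_t lo_nibble, uint8_t qh, int bit) {
    uint8_t hi = (uint8_t)((qh >> bit) & 1);
    return sli_u8((uint8_t)(lo_nibble & 0x0F), hi, 4);
}

// Exhaustively compare both paths over every nibble, qh byte and bit index.
// Returns 1 when they always agree, 0 otherwise.
static int q5_paths_agree(void) {
    for (int lo = 0; lo < 16; ++lo) {
        for (int qh = 0; qh < 256; ++qh) {
            for (int bit = 0; bit < 8; ++bit) {
                if (q5_and_shift((uint8_t)lo, (uint8_t)qh, bit) !=
                    q5_shift_insert((uint8_t)lo, (uint8_t)qh, bit)) {
                    return 0;
                }
            }
        }
    }
    return 1;
}
```

Calling `q5_paths_agree()` checks the two formulations against each other for every input. Whether the vector kernels in this PR arrange `qh` so that the remaining mask also disappears is not visible from the timeline, so that part is left open here.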
Alcpz force pushed from e1f60b6e to 69b24778 7 days ago
ggerganov approved these changes on 2026-01-23
ggerganov merged 091a46cb into master 6 days ago
Alcpz deleted the Alcpz/arm_q5_K_repack branch 3 days ago