llama.cpp
ggml-cpu: Optimized x86 and generic cpu q1_0 dot (follow up)
#21636
Merged


pl752
pl752 Implemented optimized q1_0 dot for x86 and generic
195593bc
pl752 Removed redundant helper definition
e29cd486
pl752 marked this pull request as ready for review 58 days ago
pl752 requested a review from ggerganov 58 days ago
pl752
khosravipasha
pl752 Removed two redundant instructions from AVX q1_0 dot
8587b5cc
pl752
github-actions added the ggml label
pl752 Fixed inconsistency with fp16 conversion for generic q1_0 dot and ded…
0c4fb41f
pl752 Style cleanup around AVX q1_0 dot
7f82cf0c
pl752
khosravipasha
pl752 changed the title from "(Performance; ggml-cpu) Optimized x86 and generic cpu q1_0 dot (follow up)" to "ggml-cpu: Optimized x86 and generic cpu q1_0 dot (follow up)" 55 days ago
pl752
zcattacz
pl752
am17an approved these changes on 2026-04-13
pl752 Replaced explicitly unrolled blocks with inner for loop for q1_0
67f8d32d
pl752
khosravipasha
pl752
khosravipasha
pl752
pl752 Replaced scalar ARM q1_0 impl with new generic one
715f62ac
pl752
am17an approved these changes on 2026-04-15
khosravipasha
ggerganov approved these changes on 2026-04-20
ggerganov merged 7f251fdb into master 46 days ago

