llama.cpp
Make i-quants work with super-blocks of 64 (CPU and Metal)
#5760
Merged
ggerganov merged 6 commits into master from ik/i-quants-64
13ba37f1 WIP: make i-quants work for QK_K = 64
28e6146c iq2_xs: attempt to fix AVX dot product for QK_K = 64
de64e061 QK_K = 64 tests pass on ARM_NEON and Metal
2540a290 Make CUDA compile with QK_K = 64
47d52b2b Q2_K: fixed bug in imatrix quantization for QK_K = 64
f0cbb6dd iq1_s: turn off SIMD implementation for QK_K = 64 (it does not work)
ggerganov approved these changes on 2024-02-28
ggerganov merged 7c4263d4 into master 1 year ago
Reviewers: ggerganov
Assignees: No one assigned
Labels: None yet
Milestone: No milestone