llama.cpp
ggml : add Q5_0 and Q5_1 quantization
#1187
Merged
ggerganov merged 10 commits into master from q5_0
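For context on what the PR introduces: both 5-bit formats quantize 32 weights per block, with Q5_0 storing a single fp16 scale and Q5_1 adding an fp16 minimum for asymmetric quantization; the fifth bit of each weight is collected into a 32-bit `qh` field (see the `qh -> uint32_t` fix in the commit list). A minimal sketch of the block layouts follows; it is modeled on ggml's conventions but is this note's own reconstruction, not code from the diff:

```c
#include <stdint.h>

#define QK5_0 32  // weights per block, for both Q5_0 and Q5_1

// Sketch of a Q5_0 block: symmetric quantization, one fp16 scale per block.
typedef struct {
    uint16_t d;              // scale, stored as raw fp16 bits
    uint8_t  qh[4];          // 5th (high) bit of each of the 32 quants
    uint8_t  qs[QK5_0 / 2];  // low 4 bits, two quants packed per byte
} block_q5_0;                // 2 + 4 + 16 = 22 bytes for 32 weights

// Sketch of a Q5_1 block: adds an fp16 minimum, so quants are unsigned
// offsets from m rather than signed values around zero.
typedef struct {
    uint16_t d;              // scale, raw fp16 bits
    uint16_t m;              // minimum, raw fp16 bits
    uint8_t  qh[4];          // 5th bit of each quant
    uint8_t  qs[QK5_0 / 2];  // low 4 bits, two quants per byte
} block_q5_1;                // 2 + 2 + 4 + 16 = 24 bytes for 32 weights
```

At 22 and 24 bytes per 32 weights, that works out to 5.5 and 6.0 bits per weight respectively, which is where the extra quality over Q4_0/Q4_1 comes from.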
ggerganov added the high priority label
ggerganov added the generation quality label
ikawrakow commented on 2023-04-26
ggerganov force-pushed 2 years ago
ggerganov changed the title from "ggml : add Q5_0 quantization" to "ggml : add Q5_0 and Q5_1 quantization" 2 years ago
5bebc0a6 ggml : add Q5_0 quantization (cuBLAS only)
2576c16f ggml : fix Q5_0 qh -> uint32_t
99238e4c ggml : fix q5_0 histogram stats
ef8e3ee6 ggml : q5_0 scalar dot product
b294b7fd ggml : q5_0 ARM NEON dot
d390f4f7 ggml : q5_0 more efficient ARM NEON using uint64_t masks
b9c43584 ggml : rename Q5_0 -> Q5_1
8e936ad0 ggml : adding Q5_0 mode
982bfce6 quantize : add Q5_0 and Q5_1 to map
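The scalar dot product and dequantization commits above revolve around one operation: reuniting each quant's low 4 bits from `qs` with its 5th bit from `qh`. A minimal scalar sketch of that step, assuming ggml's bit layout (bits 0..15 of `qh` belong to the first 16 quants, bits 16..31 to the second 16) and eliding the fp16 decode by taking the scale as a plain float:

```c
#include <stdint.h>

// Scalar dequantization sketch for one Q5_0 block (32 weights).
// Each byte of qs holds two 4-bit quants; the matching 5th bits come
// from qh. The stored range [0, 31] is shifted by -16 to [-16, 15].
static void dequantize_q5_0_sketch(float d, uint32_t qh,
                                   const uint8_t qs[16], float y[32]) {
    for (int j = 0; j < 16; ++j) {
        // Bit j of qh is the 5th bit of quant j (low nibble of qs[j]);
        // bit j+16 is the 5th bit of quant j+16 (high nibble of qs[j]).
        const uint8_t xh_0 = (uint8_t)(((qh >> j) << 4) & 0x10);
        const uint8_t xh_1 = (uint8_t)((qh >> (j + 12)) & 0x10);

        const int32_t x0 = ((qs[j] & 0x0F) | xh_0) - 16;
        const int32_t x1 = ((qs[j] >>   4) | xh_1) - 16;

        y[j]      = x0 * d;
        y[j + 16] = x1 * d;
    }
}
```

The NEON `uint64_t` masks commit and the AVX2 follow-up (#1195) vectorize exactly this 5th-bit scatter, which is the expensive part of the scalar loop.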
ggerganov force-pushed to 982bfce6 2 years ago
ggerganov marked this pull request as ready for review 2 years ago
2bfa1fe8 ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195)
ggerganov merged 574406dc into master 2 years ago
ggerganov deleted the q5_0 branch 2 years ago
mofosyne added the Tensor Encoding Scheme label
mofosyne added the Review Complexity : High label
Reviewers: ikawrakow
Assignees: No one assigned
Labels: high priority, generation quality, Review Complexity : High, Tensor Encoding Scheme
Milestone: No milestone