llama.cpp
ggml : add Q5_0 and Q5_1 quantization
#1187
Merged

ggerganov merged 10 commits into master from q5_0
ggerganov added the high priority label
ggerganov added the generation quality label
ikawrakow commented on 2023-04-26
ggerganov force pushed 2 years ago
ggerganov changed the title from "ggml : add Q5_0 quantization" to "ggml : add Q5_0 and Q5_1 quantization" 2 years ago
ggerganov ggml : add Q5_0 quantization (cuBLAS only)
5bebc0a6
ggerganov ggml : fix Q5_0 qh -> uint32_t
2576c16f
ggerganov ggml : fix q5_0 histogram stats
99238e4c
ggerganov ggml : q5_0 scalar dot product
ef8e3ee6
ggerganov ggml : q5_0 ARM NEON dot
b294b7fd
ggerganov ggml : q5_0 more efficient ARM NEON using uint64_t masks
d390f4f7
ggerganov ggml : rename Q5_0 -> Q5_1
b9c43584
ggerganov ggml : adding Q5_0 mode
8e936ad0
ggerganov quantize : add Q5_0 and Q5_1 to map
982bfce6
ggerganov force pushed to 982bfce6 2 years ago
ggerganov marked this pull request as ready for review 2 years ago
sw ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195)
2bfa1fe8
ggerganov merged 574406dc into master 2 years ago
ggerganov deleted the q5_0 branch 2 years ago
mofosyne added the Tensor Encoding Scheme label
mofosyne added the Review Complexity : High label
