llama.cpp
ggml : remove bit shuffling
#1405
Merged
ggerganov merged 32 commits into master from remove-vzip-2
ggerganov force pushed 3 years ago
ggerganov marked this pull request as ready for review 3 years ago
ggerganov requested a review from sw 3 years ago
ggml : remove Q4_0 bit shuffling (ARM NEON)
5fa47bf6
ggml : remove Q4_1 bit shuffling (ARM NEON + reference)
844d2af8
ggml : nibbles_from_floats() + bytes_from_nibbles() (ARM NEON)
fd2a137f
ggml : remove Q4_2 bit shuffling (WIP, BROKEN)
9f3285f7
ggml : remove Q5_0 bit shuffling (ARM NEON)
aa78dfed
ggml : 2x faster scalar implementations
b37a08f6
ggml : remove Q5_1 bit shuffling (ARM NEON + scalar)
292a778c
ggml : simplify scalar dot
caaacd57
ggml : remove WASM SIMD bit shuffling + remove vzip for ARM 32-bit
0add6402
ggml : fix Q4_1 quantization
9472d0ea
ggml : update cuBLAS + normalize variable names
cdc96073
ggml : remove Q4_2 mode
4bf1c8a4
ggml : minor formatting
b08c39b1
ggml : fix Q5_0 quantization
83674556
scripts : add script for measuring the time per token
928d2f33
AVX implementations (#1370)
9e49d201
ggml : uniform 5th bit extraction
489bd13f
llama : produce error upon loading old model files
d52172a5
llama : fix model magic/version write
09032e02
ggml : speed-up Q5_0 + Q5_1 at 4 threads
b7ad385d
ggml : preserve old Q4 and Q5 formats
695f3963
ggml : simplify Q8_1 - no need for low / high sums anymore
582a39ff
ggml : fix Q8_0 and Q8_1 rounding
66802448
Revert "AVX implementations (#1370)"
bd5e3730
ggml : fix AVX2 implementation
5bc286ab
sha : update hashes for 7B and 13B
e038e01e
ggerganov force pushed to e038e01e 3 years ago
readme : update timings + remove warning banner
51c25fd9
llama : update v2 PR number to 1405
1c87847b
ggml : fix WASM comments
832c53f4
sw commented on 2023-05-11
ggml : back to original bit order
ca7f069f
ggerganov changed the title from "ggml : remove bit shuffling (non-breaking)" to "ggml : remove bit shuffling" 3 years ago
ggerganov added the "performance" label
ggerganov added the "breaking change" label
readme : add note that Q4 and Q5 have been changed
b58b1f4b
llama : fix return for unknown version
cbb6a3a7
ggerganov merged b9fd7eee into master 3 years ago
ggerganov deleted the remove-vzip-2 branch 3 years ago
Excigma commented on 2023-05-11
BobbyKay1 approved these changes on 2023-12-10
Reviewers: BobbyKay1, Excigma, sw
Assignees: No one assigned
Labels: performance, breaking change
Milestone: No milestone