llama.cpp
ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0
#1508
Merged

ggerganov merged 5 commits into master from qnt-f16
ggerganov
ggerganov ggml : use F16 instead of F32 in Q4_0, Q4_1 and Q8_0
d627025c
ggerganov added the performance label
ggerganov added the breaking change label
LostRuins
Green-Sky
j-f1
ivanstepanovftw
ilyakurdyukov
ilyakurdyukov
KerfuffleV2
ggerganov llama : bump LLAMA_FILE_VERSION to 3
3094f642
ggerganov cuda : update Q4 and Q8 dequantize kernels
8b713297
ggerganov ggml : fix AVX dot products
a4434975
ggerganov
KerfuffleV2
ggerganov readme : update performance table + hot topics
c6d82555
ggerganov merged 2d5db483 into master 3 years ago
ggerganov deleted the qnt-f16 branch 3 years ago
YellowRoseCx
SlyEcho
Green-Sky
Rhialto
KerfuffleV2
EwoutH
Dwedit
KerfuffleV2
