llama.cpp
llama : make quantize example up to 2.7x faster
#3115
Merged

cebtenzzre
cebtenzzre marked this pull request as draft 2 years ago
cebtenzzre marked this pull request as ready for review 2 years ago
slaren commented on 2023-09-11
ggerganov commented on 2023-09-11
ggerganov commented on 2023-09-11
ggerganov added the "high priority" label
bobqianic
ikawrakow
cebtenzzre
cebtenzzre
cebtenzzre
ikawrakow
cebtenzzre
cebtenzzre force pushed 2 years ago
cebtenzzre
ikawrakow
ikawrakow
ikawrakow
cebtenzzre
ikawrakow
cebtenzzre
ikawrakow
ggerganov
cebtenzzre
cebtenzzre llama : refactor k-quant mixture logic into a function
0c649684
cebtenzzre llama : optimize vector use in quantize -> 179% faster
a95aa21d
cebtenzzre force pushed 2 years ago
ikawrakow
ggerganov approved these changes on 2023-09-14
cebtenzzre
cebtenzzre changed the title from "llama : make quantize example up to 3.7x faster" to "llama : make quantize example up to 2.7x faster" 2 years ago
cebtenzzre llama : don't zero-init vectors in quantize -> 5.1% faster
f727ad5f
cebtenzzre force pushed to f727ad5f 2 years ago
cebtenzzre merged 98311c42 into master 2 years ago
cebtenzzre deleted the faster-quantize branch 2 years ago
cebtenzzre restored the head branch 2 years ago
