llama : make quantize example up to 2.7x faster #3115
cebtenzzre
marked this pull request as draft 2 years ago
cebtenzzre
marked this pull request as ready for review 2 years ago
slaren
commented
on 2023-09-11
llama : refactor k-quant mixture logic into a function
0c649684
llama : optimize vector use in quantize -> 179% faster
a95aa21d
ggerganov
approved these changes
on 2023-09-14
cebtenzzre
changed the title from "llama : make quantize example up to 3.7x faster" to "llama : make quantize example up to 2.7x faster" 2 years ago
llama : don't zero-init vectors in quantize -> 5.1% faster
f727ad5f
cebtenzzre
merged
98311c42
into master 2 years ago
cebtenzzre
deleted the faster-quantize branch 2 years ago