llama.cpp
Make IQ4_NL quantization be the same on CPU/CUDA/Metal when quantizing K-cache
#6196
Merged

ggerganov merged 3 commits into master from ik/fix_k_cache_backend_tests
ikawrakow
Make quantize_row_iq4_nl do the same thing as quantization on CUDA
cd4a7c4c
Make quantize_row_iq4_nl do the same thing as quantization on CUDA
30eef31b
Now fix test-quantize-fns
68e4fed4
Vaibhavs10
Vaibhavs10 approved these changes on 2024-03-21
ggerganov
ggerganov approved these changes on 2024-03-21
ggerganov merged cfd3be76 into master 1 year ago
