Merged

Make IQ4_NL quantization be the same on CPU/CUDA/Metal when quantizing K-cache #6196

ggerganov merged 3 commits into master from ik/fix_k_cache_backend_tests

Make quantize_row_iq4_nl do the same thing as quantization on CUDA

cd4a7c4c

Make quantize_row_iq4_nl do the same thing as quantization on CUDA

30eef31b

Now fix test-quantize-fns

68e4fed4

Vaibhavs10 approved these changes on 2024-03-21

ggerganov approved these changes on 2024-03-21

ggerganov merged cfd3be76 into master 2 years ago

