llama.cpp
ggml-cuda: ds_read_b128 for q4_0 and q4_1 mmq kernels
#21168
Merged
