llama.cpp
cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization
#19624
Merged
