llama.cpp
GPU-accelerated token generation (new quantization format)
#1412
Merged
