llama.cpp
a6d1189f - k_quants tuning for Falcon-7b (#2816)

Commit
2 years ago
k_quants tuning for Falcon-7b (#2816)

* Make ggml-cuda.cu build with QK_K = 64

  Using LLAMA_CUDA_FORCE_DMMV = ON and -nommq it runs and produces a meaningful result.

* k_quants tuning for Falcon-7b

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>