llama.cpp
eaa13a48 - falcon : fix CUDA inference by making K and Q contiguous (#2830)

Committed 2 years ago
falcon : fix CUDA inference by making K and Q contiguous (#2830)

* falcon : fix CUDA inference by making K and Q contiguous

ggml-ci

* cuda : add assert to guard from non-cont ropes
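The commit message describes two changes: forcing the K and Q rope inputs into contiguous memory before the CUDA rope kernel runs, and asserting contiguity inside the kernel so the failure mode is loud. Below is a minimal sketch of that pattern, assuming the ggml C API of that era (`ggml_cont`, `ggml_is_contiguous`, `ggml_rope_inplace`); the helper name and its parameters are illustrative, not the exact code from llama.cpp's falcon graph.

```c
#include "ggml.h"

// Falcon builds K and Q as strided views into the fused QKV tensor. The CUDA
// rope kernel assumes a densely packed src0, so feeding it a non-contiguous
// view produces wrong results. Materializing the view with ggml_cont() before
// rope guarantees the dense layout the kernel expects, at the cost of a copy.
static struct ggml_tensor * rope_contiguous(
        struct ggml_context * ctx,
        struct ggml_tensor  * t,      // possibly a strided view of QKV (illustrative)
        int                   n_past,
        int                   n_rot,
        int                   n_ctx) {
    if (!ggml_is_contiguous(t)) {
        t = ggml_cont(ctx, t);        // copy the view into a contiguous tensor
    }
    // mode 2 selects the NeoX-style rope that Falcon uses
    return ggml_rope_inplace(ctx, t, n_past, n_rot, 2, n_ctx);
}
```

Per the second bullet, the CUDA side additionally gains a guard along the lines of `GGML_ASSERT(ggml_is_contiguous(src0));` in the rope path, turning a silent wrong-output path into an immediate assertion failure.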