llama.cpp
49af767f - build : add compile option to force use of MMQ kernels

Commit

2 years ago

References

cuda-quantum-batch

#3776 - cuda : improve text-generation and batched decoding performance

Author

ggerganov
