llama.cpp
b881f630 - cuda : use mmv kernel for quantum cache ops

Commit

2 years ago

cuda : use mmv kernel for quantum cache ops

References

#4312 - llama : support quantum K cache

Author

ggerganov
