llama.cpp
658987cf
- CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)
Commit
143 days ago
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)

* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID
* fix logic for RoPE support, CUDA graphs
References
#13014 - CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID
Author
JohannesGaessler
Parents
dc39a5e7