llama.cpp
658987cf - CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)

Commit
143 days ago
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)

* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID
* fix logic for RoPE support, CUDA graphs