llama.cpp
9725a313 - CUDA: reduce MMQ stream-k overhead (#22298)

Commit
22 days ago
CUDA: reduce MMQ stream-k overhead (#22298)

* CUDA: reduce MMQ stream-k overhead
* use 32 bit integers for kbc