llama.cpp
808aba39 - CUDA: optimize and refactor MMQ (#8416)

Commit
1 year ago
CUDA: optimize and refactor MMQ (#8416)

* CUDA: optimize and refactor MMQ
* explicit q8_1 memory layouts, add documentation