llama.cpp
2540a290
- Make CUDA compile with QK_K = 64
Commit
1 year ago
Make CUDA compile with QK_K = 64

Tests don't pass, plus we get misaligned access
References
#5760 - Make i-quants work with super-blocks of 64 (CPU and Metal)
Author
Iwan Kawrakow
Parents
de64e061