llama.cpp 946796fc - ggml-cuda : slight optimizations for TQ2_0
Commit
271 days ago
ggml-cuda : slight optimizations for TQ2_0

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
References
#11183 - ggml-cuda : add TQ2_0 kernels, for ternary inference on GPU
Author
compilade
Committer
compilade
Parents
f5fddb6d