ggml
bcdb75e3
- CUDA: faster q8_0 -> f16 dequantization (llama/4895)
Commit
2 years ago
Author: JohannesGaessler
Committer: ggerganov
Parents: 400c07f0