llama.cpp
76d66ee0
- CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)
Commit
1 year ago
CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)

* CUDA: faster q2_K, q3_K MMQ + int8 tensor cores
* try CI fix
* try CI fix
* try CI fix
* fix data race
* revert q2_K precision related changes
References
#7921 - CUDA: faster q2_K, q3_K MMQ + int8 tensor cores
Author
JohannesGaessler
Parents
66ef1cee