llama.cpp
ggml-cuda : add TQ2_0 kernels, for ternary inference on GPU (#11183, Open)
Commits (7)
ggml-cuda : add TQ2_0 support (compilade, committed 1 year ago)
ggml-cuda : cleanup TQ2_0 (compilade, committed 1 year ago)
Merge branch 'master' into compilade/cuda-tq2_0 (compilade, committed 1 year ago)
ggml-cuda : remove some superfluous comments for TQ2_0 tile loading (compilade, committed 1 year ago)
ggml-cuda : slight optimizations for TQ2_0 (compilade, committed 1 year ago)
ggml-metal : supports_op returns false for ternary types (compilade, committed 1 year ago)
ggml-cuda : use i and j instead of i0 and i in vec_dot_tq2_0_q8_1 (compilade, committed 1 year ago)