llama.cpp
CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) #7860 (Merged)
JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:cuda-ptx-mma-12.
JohannesGaessler force-pushed from 48ecafb2 to dc0ef0c4 (1 year ago).
slaren commented on 2024-06-10.
CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (commit 8cb2dbd1)
github-actions added the Nvidia GPU and ggml labels.
JohannesGaessler force-pushed from dc0ef0c4 to 8cb2dbd1 (1 year ago).
slaren approved these changes on 2024-06-10.
mofosyne added the Review Complexity : High label.
JohannesGaessler merged bdcb8f42 into master (1 year ago).
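The scraped page preserves no discussion beyond the title and branch name, so purely as an illustration: the branch name cuda-ptx-mma-12 suggests the kernels issue tensor core MMA instructions through inline PTX. The sketch below shows that general mechanism in isolation, a single int8 `mma.sync` on an Ampere-class GPU. It is not the PR's actual MMQ kernel; the tile shape, the linear fragment layout, and every name in it are assumptions made for the demo.

```cuda
// Hypothetical sketch (not the PR's kernel): one int8 tensor core MMA
// issued via inline PTX. Compile with: nvcc -arch=sm_80 mma_s8_demo.cu
#include <cstdio>
#include <cuda_runtime.h>

// One warp computes D(16x8, int32) += A(16x16, int8) * B(16x8, int8)
// with a single mma.sync.aligned.m16n8k16 instruction. The fragments
// are distributed over the 32 lanes: 2 regs of A, 1 of B, 4 of C per
// lane. The linear indexing here is an illustrative layout, not the
// hardware fragment mapping; with uniform inputs the result is the same.
__global__ void mma_s8_demo(const int *A, const int *B, int *C) {
    const int lane = threadIdx.x % 32;
    int a[2] = {A[lane], A[32 + lane]};  // 8 packed int8 values of A
    int b    = B[lane];                  // 4 packed int8 values of B
    int c[4] = {0, 0, 0, 0};             // int32 accumulators
    asm("mma.sync.aligned.m16n8k16.row.col.s32.s8.s8.s32 "
        "{%0, %1, %2, %3}, {%4, %5}, {%6}, {%0, %1, %2, %3};"
        : "+r"(c[0]), "+r"(c[1]), "+r"(c[2]), "+r"(c[3])
        : "r"(a[0]), "r"(a[1]), "r"(b));
    for (int i = 0; i < 4; ++i) {
        C[4 * lane + i] = c[i];
    }
}

int main() {
    int *A, *B, *C;
    cudaMalloc(&A, 64 * sizeof(int));   // 16x16 int8 tile, packed as int32
    cudaMalloc(&B, 32 * sizeof(int));   // 16x8  int8 tile, packed as int32
    cudaMalloc(&C, 128 * sizeof(int));  // 16x8  int32 accumulator tile
    cudaMemset(A, 1, 64 * sizeof(int)); // every int8 element == 1
    cudaMemset(B, 1, 32 * sizeof(int));
    mma_s8_demo<<<1, 32>>>(A, B, C);
    int host_c[128];
    cudaMemcpy(host_c, C, sizeof(host_c), cudaMemcpyDeviceToHost);
    // Each dot product sums 16 products of 1*1, so every output is 16.
    printf("C[0] = %d (expected 16)\n", host_c[0]);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```

In an actual MMQ path, the quantized blocks (q4_K, q5_K, q6_K) would first be unpacked into int8 tiles before feeding the tensor cores; here the inputs are constant bytes so the expected output is trivial to verify.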
Reviewers: slaren
Assignees: no one assigned
Labels: Nvidia GPU, Review Complexity : High, ggml
Milestone: no milestone