llama.cpp
CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) #7860
Merged

Author: JohannesGaessler
JohannesGaessler force-pushed from 48ecafb2 to dc0ef0c4
slaren commented on 2024-06-10
Commit 8cb2dbd1: CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K)
github-actions added labels: Nvidia GPU, ggml
JohannesGaessler force-pushed from dc0ef0c4 to 8cb2dbd1
slaren approved these changes on 2024-06-10
mofosyne added label: Review Complexity : High
JohannesGaessler merged bdcb8f42 into master