llama.cpp
CUDA: use tensor cores for MMQ
#7676
Merged
JohannesGaessler merged 4 commits into ggml-org:master from JohannesGaessler:cuda-ptx-mma-2
mofosyne added the Review Complexity : High label
github-actions added the Nvidia GPU and ggml labels
JohannesGaessler force-pushed to bf10e133 (1 year ago)
JohannesGaessler marked this pull request as ready for review (1 year ago)
bd89bb37 CUDA: int8 tensor cores for MMQ (legacy quants)
054d4ea9 fix out-of-bounds writes
JohannesGaessler force-pushed to 054d4ea9 (1 year ago)
slaren approved these changes on 2024-06-10
a9cde5c6 __builtin_assume -> GGML_CUDA_ASSUME
a64a81a2 fix writeback returning too early
JohannesGaessler merged 1f0dabda into master (1 year ago)
Reviewers: slaren
Assignees: no one assigned
Labels: Nvidia GPU, Review Complexity : High, ggml
Milestone: no milestone