llama.cpp
CUDA: stream-k decomposition for MMQ
#8018
Merged

JohannesGaessler
JohannesGaessler CUDA: stream-k decomposition for MMQ
da1db13d
github-actions added Nvidia GPU
github-actions added ggml
JohannesGaessler added Review Complexity : High
ggerganov
JohannesGaessler
JohannesGaessler
slaren
slaren
JohannesGaessler fix undefined memory reads for small matrices
141d0810
JohannesGaessler
slaren
slaren approved these changes on 2024-06-20
slaren
JohannesGaessler
JohannesGaessler merged d50f8897 into master 1 year ago
