llama.cpp
CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_fixup #19053
Merged

ORippler added commit 390146ec: CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_fixup
ORippler requested a review from JohannesGaessler 14 days ago
github-actions added the Nvidia GPU and ggml labels
JohannesGaessler approved these changes on 2026-02-03
JohannesGaessler merged 1f1e57f2 into master 4 days ago
ORippler deleted the osimons/fix_bw_mmq_fixup_kernel branch 4 days ago
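
The PR title concerns loop unrolling in the mul_mat_q_stream_k_fixup CUDA kernel. As a minimal illustrative sketch (not the actual patch), the general technique is to give the loop a trip count that is known at compile time so that `#pragma unroll` can take effect; the kernel name `fixup_sketch` and the template parameter `cols_per_block` below are assumptions for illustration only.

```cuda
// Minimal sketch, not the llama.cpp implementation: #pragma unroll only fully
// unrolls a loop whose trip count the compiler can see. Passing the bound as a
// template parameter (the hypothetical `cols_per_block`) makes it a
// compile-time constant; a runtime bound would silently disable unrolling.
template <int cols_per_block>
__global__ void fixup_sketch(float * __restrict__ dst,
                             const float * __restrict__ partial,
                             const int stride) {
    const int row = blockIdx.x*blockDim.x + threadIdx.x;

#pragma unroll
    for (int col = 0; col < cols_per_block; ++col) {
        // Add the partial tile produced by another wave of the stream-k
        // decomposition into the final output.
        dst[row*stride + col] += partial[row*stride + col];
    }
}
```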
