llama.cpp
ggml-cuda: ds_read_b128 for q4_0 and q4_1 mmq kernels
#21168
Merged

JohannesGaessler merged 11 commits into ggml-org:master from DENEB1312:master
iacopPBK ds_read_b128 for q4_0 and q4_1 mmq kernels
495c3632
iacopPBK requested a review 12 days ago
github-actions added the Nvidia GPU label
github-actions added the ggml label
JohannesGaessler commented on 2026-03-30
JohannesGaessler
iacopPBK Vectorized lds load update: used ggml_cuda_get_max_cpy_bytes and ggml…
cc9ea913
iacopPBK
JohannesGaessler
JohannesGaessler commented on 2026-03-30
iacopPBK Explicit for loop in mmq, renamed vec into tmp
62c2f8f7
JohannesGaessler commented on 2026-03-30
iacopPBK Fixed max_cpy usage in the loading loop
0bcddd21
iacopPBK Fixed typo in q4_1 kernel
5d7df5df
JohannesGaessler commented on 2026-04-01
iacopPBK Update ggml/src/ggml-cuda/mmq.cuh
d3065542
iacopPBK Update ggml/src/ggml-cuda/mmq.cuh
777f5943
iacopPBK Update ggml/src/ggml-cuda/mmq.cuh
fbc4cfcd
unverbraucht
skyne98
JohannesGaessler
JohannesGaessler
skyne98
iacopPBK Renoved trailing white line 500
b9a6e49b
iacopPBK
JohannesGaessler
iacopPBK Update mmq.cuh removed other whitelines
ce4c2a23
iacopPBK Remove trailing whitespaces
bc7b30ff
iacopPBK
JohannesGaessler
pwilkin approved these changes on 2026-04-07
JohannesGaessler approved these changes on 2026-04-07
JohannesGaessler merged 66c4f9de into master 3 days ago

