llama.cpp
sycl : port multi-column MMVQ from CUDA backend (~45% speculative decoding speedup on Intel Arc)
#21845
Merged

masonmilby
masonmilby requested a review 54 days ago
github-actions added ggml
github-actions added SYCL
arthw
masonmilby
masonmilby changed the title from "sycl : port multi-column MMVQ from CUDA backend (~75% speculative decoding speedup on Intel Arc)" to "sycl : port multi-column MMVQ from CUDA backend (~45% speculative decoding speedup on Intel Arc)" 51 days ago
arthw
arthw
masonmilby
NeoZhangJianyu
masonmilby
NeoZhangJianyu
masonmilby
NeoZhangJianyu
arthw
arthw commented on 2026-05-09
masonmilby
masonmilby marked this pull request as draft 28 days ago
arthw
R-SITES
masonmilby committed "sycl : port multi-column MMVQ from CUDA backend" (113d79e3)
masonmilby force pushed from d5ca0928 to 113d79e3 2 days ago
masonmilby
R-SITES
masonmilby
masonmilby marked this pull request as ready for review 1 day ago
masonmilby requested a review from arthw 1 day ago
arthw
arthw approved these changes on 2026-06-05
masonmilby
arthw added merge ready
NeoZhangJianyu
ggerganov merged 7fe2ae45 into master 1 day ago
tac39us-stack
masonmilby
sheigl

