sycl : port multi-column MMVQ from CUDA backend (~45% speculative decoding speedup on Intel Arc) #21845
masonmilby changed the title from "sycl : port multi-column MMVQ from CUDA backend (~75% speculative decoding speedup on Intel Arc)" to "sycl : port multi-column MMVQ from CUDA backend (~45% speculative decoding speedup on Intel Arc)" 51 days ago
arthw commented on 2026-05-09
masonmilby marked this pull request as draft 28 days ago
Commit 113d79e3: sycl : port multi-column MMVQ from CUDA backend
masonmilby force pushed from d5ca0928 to 113d79e3 2 days ago
masonmilby marked this pull request as ready for review 1 day ago
arthw approved these changes on 2026-06-05
ggerganov merged 7fe2ae45 into master 1 day ago
Assignees: No one assigned
Labels: ggml, merge ready, SYCL