llama.cpp
ggml-webgpu: updated matrix-vector multiplication
#21738
Merged
reeselevine merged 10 commits into ggml-org:master from reeselevine:k_quant_speedup
merged properly, but slow q3_k and q5_k with u32 indexing
3c36b556
neha-ha requested a review from ggerganov 72 days ago
neha-ha requested a review 72 days ago
github-actions added the ggml label
github-actions added the WebGPU label
Start on new mat-vec
3c9e474c
New format float paths working
0bcf75c1
Working q4_0
01bd9127
Work on remaining legacy q-types
f839c103
port k-quants to new matvec
ba961225
remove old shader
b4b6ffc4
Merge remote-tracking branch 'upstream/master' into k_quant_speedup
83a0d381
reeselevine force pushed from 41259410 to 83a0d381 65 days ago
Remove old constants, format
ca49e73a
reeselevine approved these changes on 2026-04-17
reeselevine requested a review from CISC 65 days ago
reeselevine added the merge ready label
CISC approved these changes on 2026-04-17
remove accidental file
b92011ef
reeselevine approved these changes on 2026-04-19
ggerganov approved these changes on 2026-04-20
reeselevine merged a6cc43c2 into master 62 days ago
Reviewers: ggerganov, reeselevine, CISC
Assignees: No one assigned
Labels: ggml, merge ready, WebGPU
Milestone: No milestone