llama.cpp
a6cc43c2 - ggml-webgpu: updated matrix-vector multiplication (#21738)

Commit

75 days ago

ggml-webgpu: updated matrix-vector multiplication (#21738)

* merged properly, but slow q3_k and q5_k with u32 indexing
* Start on new mat-vec
* New format float paths working
* Working q4_0
* Work on remaining legacy q-types
* port k-quants to new matvec
* remove old shader
* Remove old constants, format
* remove accidental file

---------

Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>

References

#21738 - ggml-webgpu: updated matrix-vector multiplication

Author

neha-ha

Parents

a6789166
