llama.cpp
a6cc43c2 - ggml-webgpu: updated matrix-vector multiplication (#21738)

Commit
26 days ago
* merged properly, but slow q3_k and q5_k with u32 indexing
* Start on new mat-vec
* New format float paths working
* Working q4_0
* Work on remaining legacy q-types
* port k-quants to new matvec
* remove old shader
* Remove old constants, format
* remove accidental file

---------

Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>