llama.cpp 1cbf5614 - metal : new q4_0 matrix-vector kernel (#2188)
Commit
2 years ago
metal : new q4_0 matrix-vector kernel (#2188)

Prefetch data to improve GPU utilization. ~48% faster for 33B model.
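For readers unfamiliar with the operation, here is a rough CPU-side sketch in C of a q4_0-style matrix-vector product that loads the next block before computing on the current one. It is not the Metal kernel from this commit, which is not shown on this page. The names `block_q4`, `block_dot`, and `matvec_q4` are hypothetical. The block layout is an assumption: 32 weights per block, 4-bit values offset by 8, low nibbles holding the first 16 weights, and a plain `float` scale where a real implementation would likely store fp16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK 32  /* weights per block (assumed) */

/* Simplified q4_0-style block: one scale plus 32 packed 4-bit values.
   Assumption: the scale is a plain float here for simplicity. */
typedef struct {
    float   d;           /* per-block scale */
    uint8_t qs[QK / 2];  /* byte j: low nibble = weight j, high nibble = weight j + 16 */
} block_q4;

/* Dot product of one quantized block with QK floats. */
static float block_dot(const block_q4 *b, const float *x) {
    float sum = 0.0f;
    for (int j = 0; j < QK / 2; ++j) {
        int lo = (b->qs[j] & 0x0F) - 8;  /* stored nibbles are offset by 8 */
        int hi = (b->qs[j] >> 4) - 8;
        sum += (float)lo * x[j] + (float)hi * x[j + QK / 2];
    }
    return b->d * sum;
}

/* y = W * x, with W stored row-major as n_rows * (n_cols / QK) blocks. */
static void matvec_q4(const block_q4 *w, const float *x, float *y,
                      size_t n_rows, size_t n_cols) {
    assert(n_cols % QK == 0);
    const size_t blocks_per_row = n_cols / QK;
    for (size_t r = 0; r < n_rows; ++r) {
        const block_q4 *row = w + r * blocks_per_row;
        float acc = 0.0f;
        if (blocks_per_row > 0) {
            block_q4 cur = row[0];  /* load the first block up front */
            for (size_t i = 0; i < blocks_per_row; ++i) {
                block_q4 next = cur;
                if (i + 1 < blocks_per_row) {
                    next = row[i + 1];  /* fetch the next block before computing */
                }
                acc += block_dot(&cur, x + i * QK);
                cur = next;
            }
        }
        y[r] = acc;
    }
}
```

On a CPU the explicit early load mostly illustrates the idea, since the compiler and hardware prefetcher may already reorder these accesses. On a GPU, issuing loads ahead of the arithmetic that consumes them is one common way to hide memory latency, which is presumably the effect the commit message describes.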
References
#2188 - metal: new q4_0 mat-vec mul kernel
Author
lshzh-ww
Parents
975221e9