llama.cpp
245fc3c3 - metal : faster q4_0 (#1775)

Commit
2 years ago
metal : faster q4_0 (#1775) * metal : 8% faster q4_0 Avoid copying into local uchar4 anf float4. * metal : 17% faster Q4_0 Use 64 threads in a thread group. --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Author
Parents
Loading