llama.cpp
34a846b5 - opencl: fix for small models (#11950)

Commit
256 days ago
opencl: fix for small models (#11950) * opencl: fix small shape gemv, remove unused extensions * opencl: fix `transpose_16`, `dump_tensor`, enforce subgroup size * opencl: fix for token length < 4 * opencl: use wave size of 64 for all Adreno GPUs --------- Co-authored-by: Shawn Gu <quic_shawngu@quicinc.com> Co-authored-by: Skyler Szot <quic_sszot@quicinc.com>
Author
Parents
Loading