llama.cpp
Commit 61af07c2 · 1 day ago

ggml-zendnn : adaptive fallback to CPU backend for small batch sizes (#22681)

* ggml-zendnn : add runtime env var GGML_ZENDNN_ADAPTIVE_FALLBACK to control adaptive fallback (default: enabled)
* ggml-zendnn : restore original fallback logic when adaptive fallback is disabled