llama : improve batched CPU perf with BLAS #4240
ggml : use blas even if src0 is not F32
f815fe43
llama : use n_threads_batch only when n_tokens >= 32
e9b7a5cb
llama : revert n_threads_batch logic
87f4102a
ggerganov force pushed to 87f4102a 2 years ago
ggerganov merged 8406b092 into master 2 years ago