llama.cpp
llama : improve batched CPU perf with BLAS #4240 (Merged)

ggerganov merged 3 commits into master from gg/fix-cpu-blas
Commits:
- f815fe43 ggml : use blas even if src0 is not F32
- e9b7a5cb llama : use n_threads_batch only when n_tokens >= 32
- 87f4102a llama : revert n_threads_batch logic
ggerganov force-pushed to 87f4102a 2 years ago
ggerganov merged 8406b092 into master 2 years ago
