llama.cpp
780e24a2 - ggml : parallelize FP32 conversion when using BLAS (#5045)

Commit
2 years ago
ggml : parallelize FP32 conversion when using BLAS (#5045)

* make GGML_TASK_INIT phase can be run in multithread
* multithreaded dequantize in mul_mat when using blas library
* minor fixes
* update outdated comment
* fix coding style
* simplify code

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
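The idea named in the commit message, converting quantized data to FP32 on several threads before handing it to a BLAS routine, can be illustrated with a minimal sketch. This is not the code from the commit: `convert_job`, `convert_rows`, `convert_to_f32_parallel`, `MAX_THREADS`, and the simple "int8 values with one scale per row" format are all hypothetical names and assumptions chosen for illustration, and plain pthreads stand in for ggml's own thread pool. Each worker converts an interleaved subset of rows, so no two workers write to the same output row.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16

/* Hypothetical job description: one per worker thread. */
typedef struct {
    const int8_t *q;      /* n_rows * n_cols quantized values (assumed format) */
    const float  *scale;  /* one scale factor per row (assumed format) */
    float        *out;    /* n_rows * n_cols FP32 output buffer */
    int n_rows, n_cols;
    int ith, nth;         /* this worker's index and the total worker count */
} convert_job;

/* Worker ith converts rows ith, ith + nth, ith + 2*nth, ... */
static void *convert_rows(void *arg) {
    convert_job *job = arg;
    for (int r = job->ith; r < job->n_rows; r += job->nth) {
        for (int c = 0; c < job->n_cols; c++) {
            size_t i = (size_t) r * job->n_cols + c;
            job->out[i] = job->scale[r] * (float) job->q[i];
        }
    }
    return NULL;
}

/* Converts the whole matrix to FP32 using nth threads; returns 0 on success. */
static int convert_to_f32_parallel(const int8_t *q, const float *scale,
                                   float *out, int n_rows, int n_cols, int nth) {
    assert(nth > 0 && nth <= MAX_THREADS);

    pthread_t   threads[MAX_THREADS];
    convert_job jobs[MAX_THREADS];
    int started = 0;
    int rc = 0;

    for (int t = 0; t < nth; t++) {
        jobs[t] = (convert_job){ q, scale, out, n_rows, n_cols, t, nth };
        if (pthread_create(&threads[t], NULL, convert_rows, &jobs[t]) != 0) {
            rc = -1;  /* stop launching; still join the threads already started */
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
    }
    return rc;
}
```

Once every worker has been joined, the FP32 buffer is complete and could be passed to a single-precision matrix multiply from a BLAS library. The join acts as the synchronization point between the conversion step and the multiply.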