llama.cpp
f578b86b - move BLAS to a separate backend (#6210)

Commit · 1 year ago
move BLAS to a separate backend (#6210)

* move BLAS to a separate backend
* rename GGML_USE_OPENBLAS to GGML_USE_BLAS
* alloc : reuse the same buffer when the same buffer type is used multiple times
* set the number of threads automatically for OpenBLAS and BLIS
* sched : print assignments when the GGML_SCHED_DEBUG env variable is set
* sched : allow ops with weights on an incompatible buffer type

This will cause the weight to be copied to a backend that supports the op, which is very costly. The weight should have been stored in a buffer of a backend that can run the op, but llama.cpp cannot do this automatically at the moment.

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
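The commit message mentions a GGML_SCHED_DEBUG environment variable that makes the scheduler print its backend assignments, which is useful for checking whether weights landed on a buffer type the BLAS backend can use. A minimal sketch of exercising it (the CMake option name and binary path are assumptions, not taken from this commit; consult the repository's CMakeLists.txt for the exact flags at this revision):

```shell
# Hedged sketch: the cmake option and binary name below are assumptions
# for illustration; check CMakeLists.txt for the names at this commit.

# Build with a BLAS backend enabled (option name may differ by version):
# cmake -B build -DLLAMA_BLAS=ON && cmake --build build

# The env variable itself is named in the commit message: when set, the
# scheduler prints which backend each graph split was assigned to.
export GGML_SCHED_DEBUG=1

# Run any llama.cpp binary with the variable set to see the assignments:
# ./build/bin/main -m model.gguf -p "hello"
echo "GGML_SCHED_DEBUG=$GGML_SCHED_DEBUG"
```

Costly weight copies reported here usually indicate the model was loaded into a buffer type the executing backend does not support, which is exactly the case the last bullet of the commit message describes.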
Files changed:
  • CMakeLists.txt
  • Makefile
  • examples/llama-bench/llama-bench.cpp
  • ggml-alloc.c
  • ggml-backend-impl.h
  • ggml-backend.c
  • ggml-backend.h
  • ggml-blas.cpp
  • ggml-blas.h
  • ggml-cuda.cu
  • ggml-kompute.cpp
  • ggml-metal.m
  • ggml-rpc.cpp
  • ggml-sycl.cpp
  • ggml-vulkan.cpp
  • ggml.c
  • llama.cpp