llama.cpp
f578b86b - move BLAS to a separate backend (#6210)

Commit · 1 year ago
move BLAS to a separate backend (#6210)

* move BLAS to a separate backend
* rename GGML_USE_OPENBLAS to GGML_USE_BLAS
* alloc : reuse the same buffer when the same buffer type is used multiple times
* set the number of threads automatically for OpenBLAS and BLIS
* sched : print assignments when the GGML_SCHED_DEBUG env variable is set
* sched : allow ops with weights on an incompatible buffer type

This will cause the weight to be copied to a backend that supports the op, which is very costly. The weight should have been stored in a buffer of a backend that can run the op, but llama.cpp cannot do this automatically at the moment.

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
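The commit message mentions a GGML_SCHED_DEBUG environment variable that makes the scheduler print its backend assignments, which is useful for checking whether weights landed on a buffer type the BLAS backend can use. A minimal sketch of exercising it (the CMake option name and binary path are assumptions, not taken from this commit; consult the repository's CMakeLists.txt for the exact flags at this revision):

```shell
# Hedged sketch: the cmake option and binary name below are assumptions
# for illustration; check CMakeLists.txt for the names at this commit.

# Build with a BLAS backend enabled (option name may differ by version):
# cmake -B build -DLLAMA_BLAS=ON && cmake --build build

# The env variable itself is named in the commit message: when set, the
# scheduler prints which backend each graph split was assigned to.
export GGML_SCHED_DEBUG=1

# Run any llama.cpp binary with the variable set to see the assignments:
# ./build/bin/main -m model.gguf -p "hello"
echo "GGML_SCHED_DEBUG=$GGML_SCHED_DEBUG"
```

Costly weight copies reported here usually indicate the model was loaded into a buffer type the executing backend does not support, which is exactly the case the last bullet of the commit message describes.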
Files changed:
  • CMakeLists.txt
  • Makefile
  • examples/llama-bench/llama-bench.cpp
  • ggml-alloc.c
  • ggml-backend-impl.h
  • ggml-backend.c
  • ggml-backend.h
  • ggml-blas.cpp
  • ggml-blas.h
  • ggml-cuda.cu
  • ggml-kompute.cpp
  • ggml-metal.m
  • ggml-rpc.cpp
  • ggml-sycl.cpp
  • ggml-vulkan.cpp
  • ggml.c
  • llama.cpp