llama.cpp
bbac6a26 - ggml: fix cuda kernel launch configuration for k_compute_batched_ptrs to support large batch (#16744)

ggml: fix cuda kernel launch configuration for k_compute_batched_ptrs to support large batch (#16744)

* fix k_compute_batched_ptrs

* add backend ops test

* Update ggml/src/ggml-cuda/ggml-cuda.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* reduce the batch size

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
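The commit message names the fix but not the mechanics. A common cause of this class of bug is launching a helper kernel with a single block whose thread count equals the batch size, which fails once the batch exceeds CUDA's 1024-threads-per-block limit; the usual repair is to cap the block size and scale the grid instead. Below is a minimal sketch of that pattern. Only the kernel name `k_compute_batched_ptrs` and the file `ggml/src/ggml-cuda/ggml-cuda.cu` come from the commit message; the signature, parameter names, kernel body, and block size are illustrative assumptions, not the upstream code.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Illustrative stand-in for k_compute_batched_ptrs: each thread writes the
// per-batch source/destination pointers later consumed by a batched GEMM.
// Parameter names (n_batch, nb02, nb12, nbd2) are assumptions for this sketch.
static __global__ void k_compute_batched_ptrs(
        const char * src0, const char * src1, char * dst,
        const void ** ptrs_src, void ** ptrs_dst,
        int64_t n_batch, size_t nb02, size_t nb12, size_t nbd2) {
    // Index across the whole grid rather than a single block, and
    // bounds-check because the grid may overshoot n_batch.
    const int64_t i = (int64_t) blockIdx.x*blockDim.x + threadIdx.x;
    if (i >= n_batch) {
        return;
    }
    ptrs_src[0*n_batch + i] = src0 + i*nb02;
    ptrs_src[1*n_batch + i] = src1 + i*nb12;
    ptrs_dst[0*n_batch + i] = dst  + i*nbd2;
}

// Host-side launch: a fixed block size with the grid scaled to the batch
// count, replacing a <<<1, n_batch>>> launch that breaks past 1024 threads.
static void compute_batched_ptrs(
        const char * src0, const char * src1, char * dst,
        const void ** ptrs_src, void ** ptrs_dst,
        int64_t n_batch, size_t nb02, size_t nb12, size_t nbd2,
        cudaStream_t stream) {
    const int block_size = 128; // illustrative choice, not the upstream value
    const int64_t n_blocks = (n_batch + block_size - 1)/block_size;
    k_compute_batched_ptrs<<<n_blocks, block_size, 0, stream>>>(
        src0, src1, dst, ptrs_src, ptrs_dst, n_batch, nb02, nb12, nbd2);
}
```

Scaling the grid rather than the block works because the grid x-dimension admits up to 2^31 - 1 blocks on modern GPUs, so any realistic batch count fits, whereas a block is hard-capped at 1024 threads.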