ggml: fix CUDA kernel launch configuration for k_compute_batched_ptrs to support large batches (#16744)
* fix k_compute_batched_ptrs
* add backend ops test
* Update ggml/src/ggml-cuda/ggml-cuda.cu
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* reduce the batch size used in the backend ops test
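
A minimal sketch of the general pattern the title suggests, not the actual ggml-cuda code: a pointer-setup kernel that would previously have been limited by the 1024-threads-per-block cap is instead launched over a 1D grid of fixed-size blocks, with a bounds check inside the kernel. The kernel signature, stride names, and buffer sizes below are simplified placeholders.

```cuda
// build (assumption): nvcc -o batched_ptrs_sketch batched_ptrs_sketch.cu
#include <cuda_runtime.h>
#include <cstdint>
#include <cstdio>

// Simplified stand-in for k_compute_batched_ptrs: fill per-matrix pointer
// arrays for a batched GEMM. The real kernel in ggml-cuda.cu also handles
// broadcast strides; only the launch configuration matters here.
static __global__ void k_compute_batched_ptrs_sketch(
        const char * src0, const char * src1, char * dst,
        const void ** ptrs_src0, const void ** ptrs_src1, void ** ptrs_dst,
        int64_t ne_batch, size_t nb02, size_t nb12, size_t nbd2) {
    const int64_t i = (int64_t) blockIdx.x*blockDim.x + threadIdx.x;
    if (i >= ne_batch) {
        return; // grid is rounded up, so guard against out-of-range indices
    }
    ptrs_src0[i] = src0 + i*nb02;
    ptrs_src1[i] = src1 + i*nb12;
    ptrs_dst [i] = dst  + i*nbd2;
}

int main() {
    const int64_t ne_batch = 4096; // exceeds the 1024 threads-per-block limit

    char *src0, *src1, *dst;
    const void **ptrs_src0, **ptrs_src1;
    void **ptrs_dst;
    cudaMalloc(&src0, ne_batch*64);
    cudaMalloc(&src1, ne_batch*64);
    cudaMalloc(&dst,  ne_batch*64);
    cudaMalloc(&ptrs_src0, ne_batch*sizeof(void *));
    cudaMalloc(&ptrs_src1, ne_batch*sizeof(void *));
    cudaMalloc(&ptrs_dst,  ne_batch*sizeof(void *));

    // A launch of the form <<<1, ne_batch>>> fails once ne_batch > 1024;
    // spread the batch over a 1D grid of fixed-size blocks instead.
    const int block_size = 256;
    const int num_blocks = (int) ((ne_batch + block_size - 1)/block_size);
    k_compute_batched_ptrs_sketch<<<num_blocks, block_size>>>(
        src0, src1, dst, ptrs_src0, ptrs_src1, ptrs_dst, ne_batch, 64, 64, 64);

    cudaError_t err = cudaDeviceSynchronize();
    printf("kernel launch: %s\n", cudaGetErrorString(err));
    return 0;
}
```

The 64-bit flat index and the early return keep the kernel correct when the grid is rounded up, which is what allows the batch dimension to grow past a single block's thread limit.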
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>