llama.cpp
d798a17c
- cuda : add TODO for calling cublas from kernel + using mem pool
Commit
2 years ago
cuda : add TODO for calling cublas from kernel + using mem pool
References
cuda-batched-gemm
#3749 - cuda : add batched cuBLAS GEMM for faster attention
Author
ggerganov
Parents
27c34c01