llama.cpp
3d297c1a
- cuda : add cublasGemmStridedBatchedEx for non-broadcasted cases
Commit
2 years ago
References
#3749 - cuda : add batched cuBLAS GEMM for faster attention
Author
ggerganov
Committer
ggerganov
Parents
d4156690