whisper.cpp
2d70cd36
- CUDA: optimize FA for GQA + large batches (llama/12014)
Commit
334 days ago
References
#2844 - sync : ggml
Author
JohannesGaessler
Committer
ggerganov
Parents
98dab49b