llama.cpp
ebd062bc - cuda : use 512 threads for soft_max instead of 32
Commit (1 year ago)
cuda : use 512 threads for soft_max instead of 32
References
#4256 - ggml : add ggml_soft_max_ext
Author
ggerganov
Committer
ggerganov
Parents
580fe206
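The substance of the commit is a launch-configuration change: the CUDA soft_max kernel moves from a single 32-thread warp per row to 512 threads per block, so each thread walks far fewer columns of a long attention row and the per-row reduction parallelism grows 16x. Below is a minimal sketch of that idea, not the actual llama.cpp kernel; the names soft_max_f32 and SOFT_MAX_BLOCK_SIZE and the shared-memory tree reduction are illustrative assumptions, and the sketch omits the scale/mask arguments that the ggml_soft_max_ext operator from #4256 introduces.

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define SOFT_MAX_BLOCK_SIZE 512 // hypothetical name; the commit's point is 512 threads per block instead of 32

// One block per row: each of the 512 threads strides over the row's columns,
// then a shared-memory tree reduction produces the row max and the row sum.
__global__ void soft_max_f32(const float * x, float * dst, const int ncols) {
    const int row = blockIdx.x;
    const int tid = threadIdx.x;

    __shared__ float buf[SOFT_MAX_BLOCK_SIZE];

    // block-wide max of the row (for numerical stability)
    float vmax = -INFINITY;
    for (int col = tid; col < ncols; col += blockDim.x) {
        vmax = fmaxf(vmax, x[row*ncols + col]);
    }
    buf[tid] = vmax;
    __syncthreads();
    for (int s = blockDim.x/2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] = fmaxf(buf[tid], buf[tid + s]);
        __syncthreads();
    }
    const float row_max = buf[0];
    __syncthreads(); // all threads have read buf[0] before it is reused below

    // exponentiate and accumulate the block-wide sum
    float vsum = 0.0f;
    for (int col = tid; col < ncols; col += blockDim.x) {
        const float e = expf(x[row*ncols + col] - row_max);
        dst[row*ncols + col] = e;
        vsum += e;
    }
    buf[tid] = vsum;
    __syncthreads();
    for (int s = blockDim.x/2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] += buf[tid + s];
        __syncthreads();
    }
    const float row_sum = buf[0];

    // normalize
    for (int col = tid; col < ncols; col += blockDim.x) {
        dst[row*ncols + col] /= row_sum;
    }
}

int main() {
    const int nrows = 2, ncols = 4096;
    std::vector<float> h(nrows*ncols, 1.0f); // uniform input -> uniform softmax
    float *d_x, *d_y;
    cudaMalloc(&d_x, h.size()*sizeof(float));
    cudaMalloc(&d_y, h.size()*sizeof(float));
    cudaMemcpy(d_x, h.data(), h.size()*sizeof(float), cudaMemcpyHostToDevice);

    // 512 threads per block: each thread covers ncols/512 = 8 columns here,
    // versus ncols/32 = 128 columns with a single warp.
    soft_max_f32<<<nrows, SOFT_MAX_BLOCK_SIZE>>>(d_x, d_y, ncols);

    cudaMemcpy(h.data(), d_y, h.size()*sizeof(float), cudaMemcpyDeviceToHost);
    printf("dst[0] = %g (expected %g)\n", h[0], 1.0/ncols);
    cudaFree(d_x); cudaFree(d_y);
    return 0;
}
```

The trade-off the commit makes: with 32 threads a 4096-wide row costs each thread 128 strided iterations per pass, while with 512 threads it is 8, at the price of a deeper block reduction (9 tree steps instead of 5), which favors the long rows produced by large contexts.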