llama.cpp 2c9380dd - Only one CUDA stream per device for async compute (#1898)
Commit
2 years ago
Only one CUDA stream per device for async compute (#1898)
References
#1898 - CUDA performance optimization: asynchronous computation by using only one cudaStream
Author
JohannesGaessler
Parents
051e1b0e