llama.cpp
8f900abf
CUDA: faster softmax via shared memory + fp16 math (#4742)
Commit
2 years ago
References
#4742 - CUDA: faster softmax via shared memory + fp16 math
Author
JohannesGaessler
Parents
1fc2f265
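
The commit title points at a standard optimization pattern: compute each row's softmax inside one thread block, using shared memory for the max/sum reductions instead of round trips through global memory, with fp16 used for loads and stores. A minimal sketch of that pattern is below; the kernel name, launch configuration, and the choice to accumulate in fp32 are illustrative assumptions, not the actual llama.cpp implementation.

```cuda
// Sketch of a block-per-row softmax with shared-memory reductions.
// Assumptions (not from the commit): values are stored as half (fp16),
// the running max/sum are kept in fp32, and blockDim.x is a power of two.
#include <cuda_fp16.h>

__global__ void softmax_f16_sketch(const half * x, half * dst, int ncols) {
    const int row = blockIdx.x;
    const int tid = threadIdx.x;
    extern __shared__ float buf[]; // one float per thread, reused for both reductions

    // 1. per-thread max over this thread's slice of the row
    float max_val = -INFINITY;
    for (int col = tid; col < ncols; col += blockDim.x)
        max_val = fmaxf(max_val, __half2float(x[row * ncols + col]));

    // 2. block-wide max via shared-memory tree reduction
    buf[tid] = max_val;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] = fmaxf(buf[tid], buf[tid + s]);
        __syncthreads();
    }
    max_val = buf[0];
    __syncthreads(); // all threads read buf[0] before it is overwritten

    // 3. exponentiate (shifted by the max for numerical stability) and sum
    float sum = 0.0f;
    for (int col = tid; col < ncols; col += blockDim.x) {
        const float v = expf(__half2float(x[row * ncols + col]) - max_val);
        dst[row * ncols + col] = __float2half(v);
        sum += v;
    }
    buf[tid] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] += buf[tid + s];
        __syncthreads();
    }
    const float inv_sum = 1.0f / buf[0];

    // 4. normalize in place
    for (int col = tid; col < ncols; col += blockDim.x)
        dst[row * ncols + col] =
            __float2half(__half2float(dst[row * ncols + col]) * inv_sum);
}

// Illustrative launch: one block per row, shared memory sized to the block.
// softmax_f16_sketch<<<nrows, 256, 256 * sizeof(float)>>>(x, dst, ncols);
```

The two shared-memory reductions (max, then sum) are what replace repeated global-memory traffic; keeping the accumulators in fp32 while storing in fp16 is a common way to get the bandwidth win without losing numerical stability.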