llama.cpp
addae65f - llama : improve llama_batch API + simplify parallel example
Commit
1 year ago
llama : improve llama_batch API + simplify parallel example
References
#3228 - llama : custom attention mask + parallel decoding + no context swaps
Author: ggerganov
Committer: ggerganov
Parents: a1327c71
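The commit touches the llama_batch API from PR #3228, which lets callers submit tokens belonging to multiple KV-cache sequences in a single llama_decode call instead of relying on context swaps. Below is a minimal caller-side sketch, assuming the flat per-token seq_id layout and two-argument llama_batch_init of that era; decode_two_sequences is a hypothetical helper, and later llama.cpp revisions changed the batch layout, so consult the llama.h of your version:

```cpp
// Sketch only: feed the same prompt as two independent sequences in one
// llama_batch. Assumes the flat per-token seq_id layout from PR #3228;
// later llama.cpp versions use per-token seq_id lists and an extra
// llama_batch_init parameter.
#include "llama.h"

#include <vector>

static bool decode_two_sequences(llama_context * ctx, const std::vector<llama_token> & prompt) {
    const int32_t n_seq = 2;

    // one slot per (token, sequence) pair, token batch (no embeddings)
    llama_batch batch = llama_batch_init((int32_t) prompt.size() * n_seq, /*embd =*/ 0);

    batch.n_tokens = 0;

    for (int32_t s = 0; s < n_seq; ++s) {
        for (size_t i = 0; i < prompt.size(); ++i) {
            const int32_t idx = batch.n_tokens++;

            batch.token [idx] = prompt[i];
            batch.pos   [idx] = (llama_pos) i;            // position within its own sequence
            batch.seq_id[idx] = (llama_seq_id) s;         // KV-cache sequence this token belongs to
            batch.logits[idx] = (i == prompt.size() - 1); // request logits only for the last token
        }
    }

    const bool ok = llama_decode(ctx, batch) == 0;        // 0 means the whole batch was processed

    llama_batch_free(batch);
    return ok;
}
```

Because every token carries its own pos and seq_id, the two sequences share one forward pass while their KV-cache entries stay separate, which is what allows the parallel example to be simplified.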