llama.cpp
ded9b43c
- parallel : fix cases where the input prompts can overflow the batch
Commit
1 year ago
References
#3228 - llama : custom attention mask + parallel decoding + no context swaps
Author
ggerganov
Parents
ee1d670c
Files (1)
examples/parallel/parallel.cpp