llama.cpp
4b5f3cd6
- parallel : process system prompt once + configurable parameters + llama API
Commit
1 year ago
parallel : process system prompt once + configurable parameters + llama API
References
#3228 - llama : custom attention mask + parallel decoding + no context swaps
Author
ggerganov
Parents
82e20e9b
Files (9)
common/common.cpp
common/common.h
examples/llama-bench/llama-bench.cpp
examples/main/main.cpp
examples/parallel/parallel.cpp
examples/perplexity/perplexity.cpp
examples/speculative/speculative.cpp
llama.cpp
llama.h