llama.cpp
f7625019 - server : fix crash when system prompt is bigger than batch size (#5714)

server : fix crash when system prompt is bigger than batch size (#5714)

The system prompt is now decoded in batches.

* server : fix off-by-one n_past when start of prompt matches whole cache

  The tokens right after the matching part would otherwise skip a pos value.
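The core of the fix is to feed the system prompt to the decoder in chunks no larger than the batch size, rather than in one oversized call. Below is a minimal, self-contained sketch of that chunking loop; it is illustrative only, not the actual server code. The names `decode_in_batches` and `decode_chunk` are hypothetical, with `decode_chunk` standing in for a call like `llama_decode`, and `pos0` standing in for the running position (n_past) passed with each chunk.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch of batched system-prompt decoding. Instead of handing
// the entire token list to one decode call (which fails when the prompt is
// bigger than the batch size), we walk the tokens in steps of n_batch and
// decode each chunk at its correct starting position.
static int decode_in_batches(const std::vector<int> & tokens, int n_batch,
                             int (*decode_chunk)(const int * tok, int n, int pos0)) {
    for (int i = 0; i < (int) tokens.size(); i += n_batch) {
        // Last chunk may be smaller than n_batch.
        const int n = std::min(n_batch, (int) tokens.size() - i);
        // pos0 == i keeps token positions contiguous across chunks,
        // so no pos value is skipped between batches.
        if (decode_chunk(tokens.data() + i, n, /*pos0=*/i) != 0) {
            return 1; // propagate decode failure
        }
    }
    return 0;
}
```

With a 10-token prompt and n_batch = 4, the loop issues three decode calls of sizes 4, 4, and 2, starting at positions 0, 4, and 8.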