llama.cpp
llama: (proposal) propagating the results of `graph_compute` to the user interface
#9525

Merged

Commits

llama: propagating the results of `graph_compute` to the user interface
Xarbirus committed 1 year ago

llama: reverting kv_cache in case of failed compute
Xarbirus committed 1 year ago

llama: `llama_kv_cache_state` was removed, only the result of `llama_graph_compute` is returned
Xarbirus committed 1 year ago

llama: restore a kv_cache in case of failed computation
Xarbirus committed 1 year ago

llama: correct reverting of the entire batch.
Xarbirus committed 1 year ago

llama: updated comments
Xarbirus committed 1 year ago

llama : add comments about KV cache state after error
ggerganov committed 1 year ago