llama.cpp
15d2b46b - rpc : cache and reuse compute graphs (#15405)

Commit
16 days ago
Store the last computed graph and reuse it when possible. Also, do not return a response from GRAPH_COMPUTE; assume it always completes successfully. If this is not the case, the server closes the connection. This saves a network round trip to the server.
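The idea in the message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from the commit: `toy_client`, `toy_server`, `msg_kind`, and the one-byte header are hypothetical names and choices invented for illustration, and the "network" is a direct function call. It shows two things: the client resends the full graph only when it differs from the last one sent, and the compute call does not wait for a reply, so the only failure signal is a closed connection.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a serialized compute graph.
using graph_bytes = std::vector<uint8_t>;

// Hypothetical message kinds; not the real RPC command set.
enum class msg_kind { compute_full, compute_cached };

struct message {
    msg_kind    kind;
    graph_bytes payload; // empty for compute_cached
};

// Toy server: remembers the last graph it received.
struct toy_server {
    graph_bytes last_graph;
    bool        has_graph = false;
    bool        closed    = false;
    int         computes  = 0;

    void handle(const message & m) {
        // A cached request must not carry a payload.
        assert(m.kind == msg_kind::compute_full || m.payload.empty());
        if (closed) {
            return;
        }
        if (m.kind == msg_kind::compute_full) {
            last_graph = m.payload;
            has_graph  = true;
        } else if (!has_graph) {
            // No reply is sent, so failure is signalled by closing the connection.
            closed = true;
            return;
        }
        computes++; // "run" the stored graph
    }
};

// Toy client: caches the last graph it sent and skips resending it.
struct toy_client {
    toy_server & server;
    graph_bytes  last_sent;
    bool         has_sent   = false;
    size_t       bytes_sent = 0;

    explicit toy_client(toy_server & s) : server(s) {}

    // Returns false only if the connection is already known to be closed.
    bool graph_compute(const graph_bytes & g) {
        if (server.closed) {
            return false;
        }
        const bool reuse = has_sent && g == last_sent;
        message m{reuse ? msg_kind::compute_cached : msg_kind::compute_full,
                  reuse ? graph_bytes{} : g};
        if (!reuse) {
            last_sent = g;
            has_sent  = true;
        }
        bytes_sent += 1 + m.payload.size(); // 1-byte header (assumed) + payload
        server.handle(m);                   // fire and forget: no response is read
        return true;
    }
};
```

In this model, calling `graph_compute` twice with the same graph sends the payload once and a header-only message the second time. Comparing whole byte buffers is the simplest possible equality check; a real implementation would likely compare graph structure or a fingerprint instead.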