llama.cpp
llama : reuse compute graphs
#14482
Merged

Commits
  • llama : reuse compute graphs
    ggerganov committed 275 days ago
  • llama-bench : add graph reuse parameter
    ggerganov committed 275 days ago
  • cont : remove the parameter and the sched resets
    ggerganov committed 275 days ago
  • graph : rename update() to can_reuse()
    ggerganov committed 275 days ago
  • params : remove is_same()
    ggerganov committed 275 days ago
  • graph : set res->params in llm_graph_context constructor
    ggerganov committed 275 days ago
  • graph : avoid set_max_nodes in llm_graph_result
    ggerganov committed 271 days ago
  • kv-cache : reuse llama_context's graph result instance
    ggerganov committed 271 days ago
  • Merge branch 'master' into gg/llama-reuse-graphs
    ggerganov committed 270 days ago
  • context : reset the previous graph result upon memory updates
    ggerganov committed 270 days ago
  • batch : llama_ubatch now carries its data instead of pointing to balloc
    ggerganov committed 270 days ago
  • Merge branch 'master' into gg/llama-reuse-graphs
    ggerganov committed 270 days ago
  • merge : fix build
    ggerganov committed 270 days ago
  • graph : fix can_reuse() checks when flash-attention is disabled
    ggerganov committed 270 days ago
  • graph : move llm_graph_result impl in source file + debug env
    ggerganov committed 270 days ago