llama.cpp
llama : refactor llama_kv_cache, llama_context and llm_build_context
#11213
Closed

Commits
  • llama : add struct llama_kv_cache (wip) [no ci]
    ggerganov committed 1 year ago
  • llama : cont
    ggerganov committed 1 year ago
  • kv_cache : functions -> members
    ggerganov committed 1 year ago
  • kv_cache : fix
    ggerganov committed 1 year ago
  • kv_cache : minor
    ggerganov committed 1 year ago
  • context : prepare kv_cache_read/write to be moved to kv_cache
    ggerganov committed 1 year ago
  • kv_cache : move state read/write to llama_kv_cache
    ggerganov committed 1 year ago
  • llama : update llama_kv_self API
    ggerganov committed 1 year ago
  • context : minor
    ggerganov committed 1 year ago
  • llama : fix names [no ci]
    ggerganov committed 1 year ago
  • llama : remove references to llama_kv_cache (wip)
    ggerganov committed 1 year ago
  • cont : move kv_self update to llama_context
    ggerganov committed 1 year ago
  • context : add get_ctx_padding()
    ggerganov committed 1 year ago
  • context : move adapter code in the implementation [no ci]
    ggerganov committed 1 year ago
  • context : initial need_reserve logic
    ggerganov committed 1 year ago
  • wip
    ggerganov committed 1 year ago
  • context : introduce llama_batch_manager
    ggerganov committed 1 year ago
  • context : prepare for abstraction
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • llama : resolve rwkv conflict
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • context : store graph build function callback
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • llama : fix rwkv inference (#11618)
    MollySophia committed 1 year ago
  • llama : clear whitespaces
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • kv-cache : fix defrag condition
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • llama : dedup reserve code
    ggerganov committed 1 year ago
  • server : increase context size for the tests
    ggerganov committed 1 year ago
  • context : add decode/encode
    ggerganov committed 1 year ago
  • bman : remove ubatch member
    ggerganov committed 1 year ago
  • context : make output functions members
    ggerganov committed 1 year ago
  • context : initial abstraction
    ggerganov committed 1 year ago
  • context : move encode/decode to llama-context.cpp
    ggerganov committed 1 year ago
  • context : improve llama_context encapsulation
    ggerganov committed 1 year ago
  • context : minor naming fix
    ggerganov committed 1 year ago
  • context : move build_rope_factors to base class
    ggerganov committed 1 year ago
  • context : introduce llama_graph_i
    ggerganov committed 1 year ago
  • context : prepare llama_model graph build
    ggerganov committed 1 year ago
  • llama : models now build their graphs using llama_graph_i
    ggerganov committed 1 year ago
  • graph : restore ubatch in build_cb
    ggerganov committed 1 year ago
  • context : rename to llama_context_kv_self
    ggerganov committed 1 year ago
  • llama : introduce llama_io interfaces
    ggerganov committed 1 year ago
  • context : abstract state read/write
    ggerganov committed 1 year ago
  • context : minor cleanup
    ggerganov committed 1 year ago
  • context : move output functionality to base class
    ggerganov committed 1 year ago
  • context : abstract input
    ggerganov committed 1 year ago
  • context : abstract constructor and init
    ggerganov committed 1 year ago
  • context : remove batch_manager
    ggerganov committed 1 year ago
  • context : move common inputs to base class
    ggerganov committed 1 year ago
  • graph : update attn/kv_self names
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • graph : add llama_graph_result
    ggerganov committed 1 year ago
  • cont : return important tensors
    ggerganov committed 1 year ago
  • cont : use returned tensors from the graph build
    ggerganov committed 1 year ago
  • llama : reorder encode/decode in sources
    ggerganov committed 1 year ago
  • context : minor simplify
    ggerganov committed 1 year ago
  • model : pass llama_graph_i as ptr
    ggerganov committed 1 year ago
  • kv-cache : prepare for abstraction
    ggerganov committed 1 year ago
  • kv-cache : remove llama_kv_cache_i
    ggerganov committed 1 year ago
  • context : add llama_context_recurrent
    ggerganov committed 1 year ago
  • graph : simplify attention api
    ggerganov committed 1 year ago
  • model : fix order kvq -> qkv
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • context : add cache-less llama_context
    ggerganov committed 1 year ago
  • context : fix causal input for cache-less case
    ggerganov committed 1 year ago
  • context : add llama_kv_cache_recurrent prototype
    ggerganov committed 1 year ago
  • context : add save/load for recurrent context
    ggerganov committed 1 year ago
  • graph : remove worst_case from the API
    ggerganov committed 1 year ago
  • context : add logs
    ggerganov committed 1 year ago
  • context : wrap input tensors in struct
    ggerganov committed 1 year ago
  • context : fix n_outputs init
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • wip enc-dec
    ggerganov committed 1 year ago
  • cont : enc should work now, next is dec
    ggerganov committed 1 year ago
  • graph : remove the build_kv_... API from llama_graph_i
    ggerganov committed 1 year ago
  • context : remove redundant virtual, protected -> private
    ggerganov committed 1 year ago
  • context : fix recurrent reserve
    ggerganov committed 1 year ago
  • context : reuse built_attn_mha
    ggerganov committed 1 year ago
  • context : explicit llama_context_i abstract interface
    ggerganov committed 1 year ago
  • enc-dec : compose wip
    ggerganov committed 1 year ago
  • context : enc-dec is now working
    ggerganov committed 1 year ago
  • context : fix enc-dec state save/load
    ggerganov committed 1 year ago
  • context : pass embeddings tensor from encoder to decoder
    ggerganov committed 1 year ago
  • context : disable encoder embd tensor for now
    ggerganov committed 1 year ago
  • Merge branch 'master' into gg/llama-kv-cache
    ggerganov committed 1 year ago
  • kv-cache : basic abstraction
    ggerganov committed 1 year ago
  • llama : introduce concept of llama_memory
    ggerganov committed 1 year ago
  • context : decouple inputs, llama_graph_i become const (WIP)
    ggerganov committed 1 year ago
  • cont : migrate the rest of the inputs out of llama_context
    ggerganov committed 1 year ago
  • graph : move non-context related logic to llm_build_context
    ggerganov committed 1 year ago
  • graph : add comments
    ggerganov committed 1 year ago