llama.cpp
llama : refactor llama_kv_cache, llama_context and llm_build_context
#11213
Closed
Commits (95)
llama : add struct llama_kv_cache (wip) [no ci]
ggerganov
committed
1 year ago
llama : cont
ggerganov
committed
1 year ago
kv_cache : functions -> members
ggerganov
committed
1 year ago
kv_cache : fix
ggerganov
committed
1 year ago
kv_cache : minor
ggerganov
committed
1 year ago
context : prepare kv_cache_read/write to be moved to kv_cache
ggerganov
committed
1 year ago
kv_cache : move state read/write to llama_kv_cache
ggerganov
committed
1 year ago
llama : update llama_kv_self API
ggerganov
committed
1 year ago
context : minor
ggerganov
committed
1 year ago
llama : fix names [no ci]
ggerganov
committed
1 year ago
llama : remove references to llama_kv_cache (wip)
ggerganov
committed
1 year ago
cont : move kv_self update to llama_context
ggerganov
committed
1 year ago
context : add get_ctx_padding()
ggerganov
committed
1 year ago
context : move adapter code in the implementation [no ci]
ggerganov
committed
1 year ago
context : initial need_reserve logic
ggerganov
committed
1 year ago
wip
ggerganov
committed
1 year ago
context : introduce llama_batch_manager
ggerganov
committed
1 year ago
context : prepare for abstraction
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
llama : resolve rwkv conflict
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
context : store graph build function callback
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
llama : fix rwkv inference (#11618)
MollySophia
committed
1 year ago
llama : clear whitespaces
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
kv-cache : fix defrag condition
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
llama : dedup reserve code
ggerganov
committed
1 year ago
server : increase context size for the tests
ggerganov
committed
1 year ago
context : add decode/encode
ggerganov
committed
1 year ago
bman : remove ubatch member
ggerganov
committed
1 year ago
context : make output functions members
ggerganov
committed
1 year ago
context : initial abstraction
ggerganov
committed
1 year ago
context : move encode/decode to llama-context.cpp
ggerganov
committed
1 year ago
context : improve llama_context encapsulation
ggerganov
committed
1 year ago
context : minor naming fix
ggerganov
committed
1 year ago
context : move build_rope_factors to base class
ggerganov
committed
1 year ago
context : introduce llama_graph_i
ggerganov
committed
1 year ago
context : prepare llama_model graph build
ggerganov
committed
1 year ago
llama : models now build their graphs using llama_graph_i
ggerganov
committed
1 year ago
graph : restore ubatch in build_cb
ggerganov
committed
1 year ago
context : rename to llama_context_kv_self
ggerganov
committed
1 year ago
llama : introduce llama_io interfaces
ggerganov
committed
1 year ago
context : abstract state read/write
ggerganov
committed
1 year ago
context : minor cleanup
ggerganov
committed
1 year ago
context : move output functionality to base class
ggerganov
committed
1 year ago
context : abstract input
ggerganov
committed
1 year ago
context : abstract constructor and init
ggerganov
committed
1 year ago
context : remove batch_manager
ggerganov
committed
1 year ago
context : move common inputs to base class
ggerganov
committed
1 year ago
graph : update attn/kv_self names
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
graph : add llama_graph_result
ggerganov
committed
1 year ago
cont : return important tensors
ggerganov
committed
1 year ago
cont : use returend tensors from the graph build
ggerganov
committed
1 year ago
llama : reorder encode/decode in sources
ggerganov
committed
1 year ago
context : minor simplify
ggerganov
committed
1 year ago
model : pass llama_graph_i as ptr
ggerganov
committed
1 year ago
kv-cache : prepare for abstraction
ggerganov
committed
1 year ago
kv-cache : remove llama_kv_cache_i
ggerganov
committed
1 year ago
context : add llama_context_recurrent
ggerganov
committed
1 year ago
graph : simplify attention api
ggerganov
committed
1 year ago
model : fix order kvq -> qkv
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
context : add cache-less llama_context
ggerganov
committed
1 year ago
context : fix causal input for cache-less case
ggerganov
committed
1 year ago
context : add llama_kv_cache_recurrent prototype
ggerganov
committed
1 year ago
context : add save/load for recurrent context
ggerganov
committed
1 year ago
graph : remove worst_case from the API
ggerganov
committed
1 year ago
context : add logs
ggerganov
committed
1 year ago
context : wrap input tensors in struct
ggerganov
committed
1 year ago
context : fix n_outputs init
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
wip enc-dec
ggerganov
committed
1 year ago
cont : enc should work now, next is dec
ggerganov
committed
1 year ago
graph : remove the build_kv_... API from llama_graph_i
ggerganov
committed
1 year ago
context : remove redundant virtual, protected -> private
ggerganov
committed
1 year ago
context : fix recurrent reserve
ggerganov
committed
1 year ago
context : reuse built_attn_mha
ggerganov
committed
1 year ago
context : explicit llama_context_i abstract interface
ggerganov
committed
1 year ago
enc-dec : compose wip
ggerganov
committed
1 year ago
context : enc-dec is now working
ggerganov
committed
1 year ago
context : fix enc-dec state save/load
ggerganov
committed
1 year ago
context : pass embeddings tensor from encoder to decoder
ggerganov
committed
1 year ago
context : disable encoder embd tensor for now
ggerganov
committed
1 year ago
Merge branch 'master' into gg/llama-kv-cache
ggerganov
committed
1 year ago
kv-cache : basic abstraction
ggerganov
committed
1 year ago
llama : introduce concept of llama_memory
ggerganov
committed
1 year ago
context : decouple inputs, llama_graph_i become const (WIP)
ggerganov
committed
1 year ago
cont : migrate the rest of the inputs out of llama_context
ggerganov
committed
1 year ago
graph : move non-context related logic to llm_build_context
ggerganov
committed
1 year ago
graph : add comments
ggerganov
committed
1 year ago