llama.cpp
llama : refactor llama_kv_cache, llama_context and llm_build_context
#11213
Closed
ggerganov wants to merge 95 commits into master from gg/llama-kv-cache
github-actions added examples
github-actions added server
ggerganov force pushed 1 year ago
ggerganov force pushed 1 year ago
ggerganov force pushed to fb740247 1 year ago
github-actions added android
ggerganov force pushed to 9027f329 1 year ago
ggerganov commented on 2025-01-16
slaren commented on 2025-01-16
ggerganov commented on 2025-01-16
ggerganov marked this pull request as ready for review 1 year ago
ggerganov requested a review from ngxson 1 year ago
ggerganov changed the title from "llama : add struct llama_kv_cache" to "llama : refactor llama_kv_cache, llama_context and llm_build_context" 1 year ago
ggerganov force pushed from 60106c62 1 year ago
ggerganov marked this pull request as draft 1 year ago
ggerganov force pushed to a47d389c 1 year ago
llama : add struct llama_kv_cache (wip) [no ci]
f78b396e
llama : cont
e4550fba
kv_cache : functions -> members
4d7bd03e
kv_cache : fix
fef90cb3
kv_cache : minor
73a14ecc
context : prepare kv_cache_read/write to be moved to kv_cache
4cd1b6fa
kv_cache : move state read/write to llama_kv_cache
fd05ab87
llama : update llama_kv_self API
17b363af
context : minor
a19f671f
llama : fix names [no ci]
ae274f97
llama : remove references to llama_kv_cache (wip)
f2524c0e
cont : move kv_self update to llama_context
b4ec1d44
context : add get_ctx_padding()
f0713498
context : move adapter code in the implementation [no ci]
c75ba685
context : initial need_reserve logic
133ad6a7
wip
cb8f2095
context : introduce llama_batch_manager
99422dfa
context : prepare for abstraction
a0c500b4
ggerganov force pushed from a47d389c to a0c500b4 1 year ago
Merge branch 'master' into gg/llama-kv-cache
e665b57f
llama : resolve rwkv conflict
91888569
Merge branch 'master' into gg/llama-kv-cache
c30e34cd
Merge branch 'master' into gg/llama-kv-cache
a40ba49f
MollySophia requested a review from MollySophia 1 year ago
Merge branch 'master' into gg/llama-kv-cache
5d3491e7
context : store graph build function callback
3e23be79
ggerganov force pushed to 3e23be79 1 year ago
Merge branch 'master' into gg/llama-kv-cache
74b08072
llama : fix rwkv inference (#11618)
1eca8916
llama : clear whitespaces
e0d913fc
Merge branch 'master' into gg/llama-kv-cache
0f1c1cab
kv-cache : fix defrag condition
b15fede7
ggerganov force pushed to b15fede7 1 year ago
Merge branch 'master' into gg/llama-kv-cache
972f91c7
llama : dedup reserve code
f9971ef2
server : increase context size for the tests
879ba827
github-actions added python
context : add decode/encode
ef358ee7
ggerganov force pushed to ef358ee7 1 year ago
bman : remove ubatch member
d1d8d530
context : make output functions members
2cd8a903
context : initial abstraction
02ef4be9
ggerganov force pushed to 02ef4be9 1 year ago
context : move encode/decode to llama-context.cpp
b52b79b0
context : improve llama_context encapsulation
8da7f612
ggerganov force pushed to 8da7f612 1 year ago
context : minor naming fix
d146a14f
context : move build_rope_factors to base class
5eae8e51
context : introduce llama_graph_i
e633dc17
context : prepare llama_model graph build
0ab50f1b
ggerganov force pushed to 0ab50f1b 1 year ago
llama : models now build their graphs using llama_graph_i
f63aeecc
graph : restore ubatch in build_cb
6ee86e5e
context : rename to llama_context_kv_self
fbe6a072
llama : introduce llama_io interfaces
3a504d9a
ggerganov force pushed to 3a504d9a 1 year ago
context : abstract state read/write
f7c7757b
context : minor cleanup
e08f38df
context : move output functionality to base class
107d1e2c
context : abstract input
ed3cb55a
context : abstract constructor and init
131743ff
ggerganov force pushed to 131743ff 1 year ago
context : remove batch_manager
d5e8e1a2
context : move common inputs to base class
82806456
graph : update attn/kv_self names
1d801d27
Merge branch 'master' into gg/llama-kv-cache
f0d3ff23
graph : add llama_graph_result
c2359031
cont : return important tensors
172f6169
ggerganov force pushed to 172f6169 1 year ago
cont : use returend tensors from the graph build
bc6f187e
llama : reorder encode/decode in sources
befe14f0
context : minor simplify
9e50456e
model : pass llama_graph_i as ptr
2bffc2d5
kv-cache : prepare for abstraction
f5cedbca
ggerganov force pushed to f5cedbca 1 year ago
kv-cache : remove llama_kv_cache_i
5f11a550
ggerganov force pushed 1 year ago
ggerganov force pushed 1 year ago
ggerganov force pushed 1 year ago
context : add llama_context_recurrent
e17e4b72
ggerganov force pushed to e17e4b72 1 year ago
graph : simplify attention api
2eacb4c1
model : fix order kvq -> qkv
f95b04a2
Merge branch 'master' into gg/llama-kv-cache
072280ea
ggerganov force pushed 1 year ago
ggerganov force pushed 1 year ago
context : add cache-less llama_context
b1554be1
ggerganov force pushed to b1554be1 1 year ago
context : fix causal input for cache-less case
ad870c49
ggerganov force pushed to ad870c49 1 year ago
context : add llama_kv_cache_recurrent prototype
08011c2c
context : add save/load for recurrent context
2645a7d9
graph : remove worst_case from the API
548c230d
context : add logs
ebf1bdf9
context : wrap input tensors in struct
f588a70d
context : fix n_outputs init
3753b30d
Merge branch 'master' into gg/llama-kv-cache
c4c0a4d1
wip enc-dec
f5e80208
ngxson commented on 2025-02-22
cont : enc should work now, next is dec
372fa3a8
graph : remove the build_kv_... API from llama_graph_i
6378112c
context : remove redundant virtual, protected -> private
0699a44c
context : fix recurrent reserve
a5a85a3b
context : reuse built_attn_mha
4a1054b5
context : explicit llama_context_i abstract interface
9cd78f11
enc-dec : compose wip
be58e300
context : enc-dec is now working
e5bc5f8e
context : fix enc-dec state save/load
e2b3294f
context : pass embeddings tensor from encoder to decoder
4efe9898
ggerganov commented on 2025-02-25
context : disable encoder embd tensor for now
952feedf
Merge branch 'master' into gg/llama-kv-cache
82675a01
kv-cache : basic abstraction
828effd9
ggerganov force pushed to 828effd9 1 year ago
llama : introduce concept of llama_memory
38db8a58
context : decouple inputs, llama_graph_i become const (WIP)
7f02ee56
ggerganov force pushed to 7f02ee56 1 year ago
cont : migrate the rest of the inputs out of llama_context
9cab53c7
graph : move non-context related logic to llm_build_context
0f7daa9d
graph : add comments
624f7bd0
ggerganov closed this 1 year ago
Reviewers: slaren, ngxson, MollySophia
Assignees: No one assigned
Labels: android, examples, python, server
Milestone: No milestone