llama.cpp
kv-cache : refactor + add llama_memory_state_i
#13746
Merged
ggerganov merged 14 commits into master from gg/kv-cache-simplify-part3
github-actions added the examples label
github-actions added the server label
ggerganov force pushed from d23f8875 to 8323e238 106 days ago
Base automatically changed from gg/kv-cache-simplify-part2 to master 105 days ago
ggerganov force pushed from c1434b81 to 1eec34ad 105 days ago
ggerganov marked this pull request as ready for review 105 days ago
ggerganov requested a review from ngxson 105 days ago
ggerganov requested a review from slaren 105 days ago
ggerganov commented on 2025-05-25
ggerganov commented on 2025-05-25
ggerganov force pushed from 3ef770f9 to 0b73da5a 104 days ago
ggerganov force pushed from 0b73da5a to 2252eefd 103 days ago
ggerganov marked this pull request as draft 103 days ago
gabe-l-hart commented on 2025-05-27
gabe-l-hart commented on 2025-05-27
ggerganov force pushed from ab2e2758 to be635a71 102 days ago
ggerganov force pushed from be635a71 to 7dc61c2d 102 days ago
ggerganov force pushed from 7dc61c2d to a3ebf0aa 102 days ago
slaren commented on 2025-05-28
ggerganov force pushed from 19424279 to a592c137 101 days ago
gabe-l-hart commented on 2025-05-28
gabe-l-hart commented on 2025-05-28
ggerganov force pushed from 37cec432 to 825efad5 101 days ago
ggerganov force pushed from 825efad5 to eed741e9 101 days ago
slaren approved these changes on 2025-05-29
ggerganov force pushed from 9548d2a1 to 256f1b70 100 days ago
ggerganov force pushed from 256f1b70 to 9d053810 100 days ago
ggerganov force pushed from 9d053810 to 2b984f41 100 days ago
ggerganov marked this pull request as ready for review 100 days ago
773b6e39 kv-cache : simplify the "struct llama_kv_cache" interface
9fc50dcd kv-cache : revert the (n_swa + n_ubatch) change (for next PR)
c2c35917 kv-cache : some comments
88567820 context : fix graph reserve for multiple sequences
bffb9d4a kv-cache : fix typo [no ci]
32cc9eab kv-cache : fix find_slot() logic for free slots
f97de9b7 llama : add TODO for deprecating the defrag API in the future
7764d914 kv-cache : improve find_slot() using min/max seq pos info
780bba94 llama : handle aborts and compute errors
dbcfa5f1 memory : extract state into llama_memory_state
f2ded9d4 kv-cache : add comments
e230e514 server : update batching logic to reset n_batch on successful decode
3cf51863 server : upon full re-processing, remove the sequence from the cache
71619f2d kv-cache : add TODO for doing split_equal when split_simple fails
ggerganov force pushed from f23e4cca to 71619f2d 99 days ago
ggerganov changed the title from "kv-cache : simplify" to "kv-cache : refactor + add llama_memory_state_i" 99 days ago
ggerganov merged 12d0188c into master 99 days ago
ggerganov deleted the gg/kv-cache-simplify-part3 branch 99 days ago
Reviewers: slaren, gabe-l-hart, compilade, ngxson
Assignees: No one assigned
Labels: examples, server
Milestone: No milestone