llama.cpp
kv-cache : refactor + add llama_memory_state_i
#13746
Merged
ggerganov merged 14 commits into master from gg/kv-cache-simplify-part3
github-actions added the examples label
github-actions added the server label
ggerganov force pushed to 8323e238 1 year ago
Base automatically changed from gg/kv-cache-simplify-part2 to master 1 year ago
ggerganov force pushed to 1eec34ad 1 year ago
ggerganov marked this pull request as ready for review 1 year ago
ggerganov requested a review from ngxson 1 year ago
ggerganov requested a review from slaren 1 year ago
ggerganov commented on 2025-05-25
ggerganov commented on 2025-05-25
ggerganov force pushed to 0b73da5a 1 year ago
ggerganov force pushed from 0b73da5a to 2252eefd 1 year ago
ggerganov marked this pull request as draft 1 year ago
gabe-l-hart commented on 2025-05-27
gabe-l-hart commented on 2025-05-27
ggerganov force pushed 1 year ago
ggerganov force pushed 1 year ago
ggerganov force pushed to a3ebf0aa 1 year ago
slaren commented on 2025-05-28
ggerganov force pushed to a592c137 1 year ago
gabe-l-hart commented on 2025-05-28
gabe-l-hart commented on 2025-05-28
ggerganov force pushed 1 year ago
ggerganov force pushed to eed741e9 1 year ago
slaren approved these changes on 2025-05-29
ggerganov force pushed from 9548d2a1 1 year ago
ggerganov force pushed 1 year ago
ggerganov force pushed to 2b984f41 1 year ago
ggerganov marked this pull request as ready for review 1 year ago
773b6e39 kv-cache : simplify the "struct llama_kv_cache" interface
9fc50dcd kv-cache : revert the (n_swa + n_ubatch) change (for next PR)
c2c35917 kv-cache : some comments
88567820 context : fix graph reserve for multiple sequences
bffb9d4a kv-cache : fix typo [no ci]
32cc9eab kv-cache : fix find_slot() logic for free slots
f97de9b7 llama : add TODO for deprecating the defrag API in the future
7764d914 kv-cache : improve find_slot() using min/max seq pos info
780bba94 llama : handle aborts and compute errors
dbcfa5f1 memory : extract state into llama_memory_state
f2ded9d4 kv-cache : add comments
e230e514 server : update batching logic to reset n_batch on successful decode
3cf51863 server : upon full re-processing, remove the sequence from the cache
71619f2d kv-cache : add TODO for doing split_equal when split_simple fails
ggerganov force pushed from f23e4cca to 71619f2d 1 year ago
ggerganov changed the title from "kv-cache : simplify" to "kv-cache : refactor + add llama_memory_state_i" 1 year ago
ggerganov merged 12d0188c into master 1 year ago
ggerganov deleted the gg/kv-cache-simplify-part3 branch 1 year ago
Reviewers: slaren, gabe-l-hart, compilade, ngxson
Assignees: No one assigned
Labels: examples, server
Milestone: No milestone