llama.cpp
kv-cache : refactor + add llama_memory_state_i
#13746
Merged

ggerganov merged 14 commits into master from gg/kv-cache-simplify-part3
github-actions added examples
github-actions added server
ggerganov force pushed from d23f8875 to 8323e238 106 days ago
Base automatically changed from gg/kv-cache-simplify-part2 to master 105 days ago
ggerganov force pushed from c1434b81 to 1eec34ad 105 days ago
ggerganov marked this pull request as ready for review 105 days ago
ggerganov requested a review from ngxson 105 days ago
ggerganov requested a review from slaren 105 days ago
ggerganov commented on 2025-05-25
ggerganov commented on 2025-05-25
ggerganov force pushed from 3ef770f9 to 0b73da5a 104 days ago
ggerganov force pushed from 0b73da5a to 2252eefd 103 days ago
ggerganov marked this pull request as draft 103 days ago
gabe-l-hart commented on 2025-05-27
gabe-l-hart commented on 2025-05-27
ggerganov force pushed from ab2e2758 to be635a71 102 days ago
ggerganov force pushed from be635a71 to 7dc61c2d 102 days ago
ggerganov force pushed from 7dc61c2d to a3ebf0aa 102 days ago
slaren commented on 2025-05-28
ggerganov force pushed from 19424279 to a592c137 101 days ago
gabe-l-hart commented on 2025-05-28
gabe-l-hart commented on 2025-05-28
ggerganov force pushed from 37cec432 to 825efad5 101 days ago
ggerganov force pushed from 825efad5 to eed741e9 101 days ago
slaren approved these changes on 2025-05-29
ggerganov force pushed from 9548d2a1 to 256f1b70 100 days ago
ggerganov force pushed from 256f1b70 to 9d053810 100 days ago
ggerganov force pushed from 9d053810 to 2b984f41 100 days ago
ggerganov marked this pull request as ready for review 100 days ago
ggerganov kv-cache : simplify the "struct llama_kv_cache" interface
773b6e39
ggerganov kv-cache : revert the (n_swa + n_ubatch) change (for next PR)
9fc50dcd
ggerganov kv-cache : some comments
c2c35917
ggerganov context : fix graph reserve for multiple sequences
88567820
ggerganov kv-cache : fix typo [no ci]
bffb9d4a
ggerganov kv-cache : fix find_slot() logic for free slots
32cc9eab
ggerganov llama : add TODO for deprecating the defrag API in the future
f97de9b7
ggerganov kv-cache : improve find_slot() using min/max seq pos info
7764d914
ggerganov llama : handle aborts and compute errors
780bba94
ggerganov memory : extract state into llama_memory_state
dbcfa5f1
ggerganov kv-cache : add comments
f2ded9d4
ggerganov server : update batching logic to reset n_batch on successful decode
e230e514
ggerganov server : upon full re-processing, remove the sequence from the cache
3cf51863
ggerganov kv-cache : add TODO for doing split_equal when split_simple fails
71619f2d
ggerganov force pushed from f23e4cca to 71619f2d 99 days ago
ggerganov changed the title from "kv-cache : simplify" to "kv-cache : refactor + add llama_memory_state_i" 99 days ago
ggerganov merged 12d0188c into master 99 days ago
ggerganov deleted the gg/kv-cache-simplify-part3 branch 99 days ago