llama.cpp
kv-cache : refactor + add llama_memory_state_i
#13746
Merged


ggerganov merged 14 commits into master from gg/kv-cache-simplify-part3
github-actions added the examples label
github-actions added the server label
ggerganov force pushed to 8323e238 1 year ago
Base automatically changed from gg/kv-cache-simplify-part2 to master 1 year ago
ggerganov force pushed to 1eec34ad 1 year ago
ggerganov marked this pull request as ready for review 1 year ago
ggerganov requested a review from ngxson 1 year ago
ggerganov requested a review from slaren 1 year ago
ggerganov commented on 2025-05-25
ggerganov commented on 2025-05-25
ggerganov force pushed to 0b73da5a 1 year ago
ggerganov force pushed from 0b73da5a to 2252eefd 1 year ago
ggerganov marked this pull request as draft 1 year ago
gabe-l-hart commented on 2025-05-27
gabe-l-hart commented on 2025-05-27
ggerganov force pushed 1 year ago
ggerganov force pushed 1 year ago
ggerganov force pushed to a3ebf0aa 1 year ago
slaren commented on 2025-05-28
ggerganov force pushed to a592c137 1 year ago
gabe-l-hart commented on 2025-05-28
gabe-l-hart commented on 2025-05-28
ggerganov force pushed 1 year ago
ggerganov force pushed to eed741e9 1 year ago
slaren approved these changes on 2025-05-29
ggerganov force pushed from 9548d2a1 1 year ago
ggerganov force pushed 1 year ago
ggerganov force pushed to 2b984f41 1 year ago
ggerganov marked this pull request as ready for review 1 year ago
ggerganov kv-cache : simplify the "struct llama_kv_cache" interface
773b6e39
ggerganov kv-cache : revert the (n_swa + n_ubatch) change (for next PR)
9fc50dcd
ggerganov kv-cache : some comments
c2c35917
ggerganov context : fix graph reserve for multiple sequences
88567820
ggerganov kv-cache : fix typo [no ci]
bffb9d4a
ggerganov kv-cache : fix find_slot() logic for free slots
32cc9eab
ggerganov llama : add TODO for deprecating the defrag API in the future
f97de9b7
ggerganov kv-cache : improve find_slot() using min/max seq pos info
7764d914
ggerganov llama : handle aborts and compute errors
780bba94
ggerganov memory : extract state into llama_memory_state
dbcfa5f1
ggerganov kv-cache : add comments
f2ded9d4
ggerganov server : update batching logic to reset n_batch on successful decode
e230e514
ggerganov server : upon full re-processing, remove the sequence from the cache
3cf51863
ggerganov kv-cache : add TODO for doing split_equal when split_simple fails
71619f2d
ggerganov force pushed from f23e4cca to 71619f2d 1 year ago
ggerganov changed the title from "kv-cache : simplify" to "kv-cache : refactor + add llama_memory_state_i" 1 year ago
ggerganov merged 12d0188c into master 1 year ago
ggerganov deleted the gg/kv-cache-simplify-part3 branch 1 year ago
