llama.cpp
llama : refactor llama_context, llama_kv_cache, llm_build_context (v2) #12181
Merged

ggerganov merged 16 commits into master from gg/llama-kv-cache-v2
github-actions added the android, examples, python, server labels
ggerganov force-pushed to 900f2faa 287 days ago
ggerganov force-pushed to 5bb8a26c 286 days ago
ggerganov force-pushed to 250f398b 286 days ago
ggerganov force-pushed 285 days ago
ggerganov force-pushed to 905164fb 285 days ago
ggerganov force-pushed 284 days ago
ggerganov force-pushed to 62ba774b 284 days ago
ggerganov marked this pull request as ready for review 284 days ago
ggerganov requested a review from ngxson 284 days ago
slaren approved these changes on 2025-03-11
ggerganov force-pushed from 62ba774b 280 days ago
ggerganov llama : refactor llama_context, llama_kv_cache, llm_build_context (55909257)
ggerganov graph : don't mutate the KV cache during defrag (75624a20)
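The defrag commit separates planning from mutation: building the defrag graph only describes which KV cells should move, and the cache bookkeeping is updated in a separate step. A minimal sketch of that idea, using made-up types rather than the actual llama.cpp API:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the KV cache bookkeeping.
struct kv_cell_move {
    uint32_t src; // cell index before defrag
    uint32_t dst; // cell index after defrag
};

struct kv_cache_view {
    std::vector<bool> used; // which cells are currently occupied
};

// Compute the compaction plan against a read-only view of the cache;
// nothing is mutated while the graph that performs the moves is being built.
static std::vector<kv_cell_move> plan_defrag(const kv_cache_view & kv) {
    std::vector<kv_cell_move> moves;
    uint32_t dst = 0;
    for (uint32_t src = 0; src < kv.used.size(); ++src) {
        if (!kv.used[src]) {
            continue;
        }
        if (src != dst) {
            moves.push_back({ src, dst });
        }
        ++dst;
    }
    return moves;
}
```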
ggerganov context : reduce virtuals + remove test function (5aa3518d)
ggerganov context : move interface implementation to source file + factory (0a6648ca)
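The factory commit is about keeping the context implementation out of the public header and constructing it through a factory function. A rough sketch of that pattern with invented names (my_context, my_context_create), not the real llama.cpp declarations:

```cpp
// --- my_context.h ----------------------------------------------------------
#include <memory>

struct my_context_params {
    int n_ctx = 4096;
};

class my_context {
public:
    explicit my_context(const my_context_params & params);
    int n_ctx() const;
private:
    my_context_params params;
};

// Factory: the one sanctioned way for callers to obtain a context.
std::unique_ptr<my_context> my_context_create(const my_context_params & params);

// --- my_context.cpp --------------------------------------------------------
my_context::my_context(const my_context_params & params) : params(params) {}

int my_context::n_ctx() const {
    return params.n_ctx;
}

std::unique_ptr<my_context> my_context_create(const my_context_params & params) {
    return std::make_unique<my_context>(params);
}
```

Keeping the member function bodies in the source file lets the header stay stable while the internals churn during the refactor.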
ggerganov graph : move KV cache build functions to llama_context impl (cc9fa25a)
ggerganov graph : remove model reference from build_pooling (29c9ef56)
ggerganov graph : remove llama_model reference (bc825604)
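The two "remove model reference" commits narrow what the graph-building code can see: instead of holding a reference to the whole llama_model, a build step receives only the tensors and parameters it actually uses. A hedged illustration with hypothetical types (pooling_weights, pooling_params), not the actual llama.cpp structures:

```cpp
// Illustrative only; not the actual llama.cpp structures.
struct tensor_ref {
    const void * data = nullptr; // stand-in for a ggml tensor pointer
};

struct pooling_weights {         // just the weights build_pooling needs
    tensor_ref cls;
    tensor_ref cls_b;
};

struct pooling_params {
    int pooling_type = 0;        // e.g. none / mean / cls
};

// Before: the builder took the whole model and picked out what it wanted.
// After:  the caller extracts the relevant pieces once and passes them in,
//         so the graph code no longer depends on the model type at all.
static void build_pooling(const pooling_weights & w, const pooling_params & p) {
    (void) w;
    (void) p;
    // ... append the pooling ops to the compute graph here ...
}
```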
ggerganov kv_cache : provide rope factors (ff95ffdf)
ggerganov graph : rework inputs to use only unique_ptr, remove attn input abstr… (562a4787)
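The input rework keeps every graph input object alive for as long as the graph needs it by storing them behind std::unique_ptr, while the builder works with plain non-owning pointers. A sketch under those assumptions (the class names here are invented, not the real llm_graph types):

```cpp
#include <memory>
#include <utility>
#include <vector>

// Base class for anything that backs a graph input tensor.
struct graph_input_base {
    virtual ~graph_input_base() = default;
    virtual void set_input(const void * batch) = 0; // fill the backing buffer
};

struct graph_input_positions : graph_input_base {
    void set_input(const void * /*batch*/) override { /* copy token positions */ }
};

class graph_example {
public:
    // Construct an input, keep ownership inside the graph, and return a raw
    // non-owning pointer that can be wired into the compute graph.
    template <typename T, typename... Args>
    T * add_input(Args &&... args) {
        auto inp = std::make_unique<T>(std::forward<Args>(args)...);
        T * res  = inp.get();
        inputs.push_back(std::move(inp));
        return res;
    }

    // Called once per batch to populate all registered inputs.
    void set_inputs(const void * batch) {
        for (auto & inp : inputs) {
            inp->set_input(batch);
        }
    }

private:
    std::vector<std::unique_ptr<graph_input_base>> inputs;
};
```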
ggerganov context : remove llama_context_i abstraction (d0cb3196)
ggerganov context : clean-up (a4fc4e8e)
ggerganov graph : clean-up (af9f6b8e)
ggerganov llama : remove redundant keywords (struct, enum) (226ff010)
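The redundant keywords in question are the C-style elaborated type specifiers: C requires `struct`/`enum` in front of such type names, C++ does not. A tiny example with made-up types:

```cpp
struct kv_entry { int pos; };
enum split_mode { SPLIT_NONE, SPLIT_LAYER };

// C style, still legal C++ but noisy:
//   struct kv_entry e;
//   enum split_mode m = SPLIT_NONE;

// Idiomatic C++ after dropping the redundant keywords:
kv_entry   e = { 0 };
split_mode m = SPLIT_NONE;

int main() {
    (void) e;
    (void) m;
    return 0;
}
```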
ggerganov model : adapt gemma3 (5fc6dbd9)
ggerganov force-pushed to 5fc6dbd9 279 days ago
ggerganov graph : restore same attention ops as on master (70ef6530)
ggerganov llama : remove TODO + fix indent (31b8eab5)
ggerganov merged e0dbec0b into master 278 days ago
ggerganov deleted the gg/llama-kv-cache-v2 branch 278 days ago
