llama.cpp
llama : refactor llama_context, llama_kv_cache, llm_build_context (v2)
#12181
Merged

ggerganov merged 16 commits into master from gg/llama-kv-cache-v2
github-actions added android
github-actions added examples
github-actions added python
github-actions added server
ggerganov force pushed from 1bdfacc9 to 900f2faa 187 days ago
ggerganov force pushed from 564747be to 5bb8a26c 187 days ago
ggerganov force pushed from 48632552 to 250f398b 187 days ago
ggerganov force pushed from 25a58486 to 72a46666 186 days ago
ggerganov force pushed from 72a46666 to 905164fb 186 days ago
ggerganov force pushed from f85d0b32 to 766edbf0 185 days ago
ggerganov force pushed from 766edbf0 to 62ba774b 185 days ago
ggerganov marked this pull request as ready for review 185 days ago
ggerganov requested a review from ngxson 185 days ago
slaren approved these changes on 2025-03-11
ggerganov force pushed from 62ba774b to a170669c 181 days ago
ggerganov llama : refactor llama_context, llama_kv_cache, llm_build_context
55909257
ggerganov graph : don't mutate the KV cache during defrag
75624a20
ggerganov context : reduce virtuals + remove test function
5aa3518d
ggerganov context : move interface implementation to source file + factory
0a6648ca
ggerganov graph : move KV cache build functions to llama_context impl
cc9fa25a
ggerganov graph : remove model reference from build_pooling
29c9ef56
ggerganov graph : remove llama_model reference
bc825604
ggerganov kv_cache : provide rope factors
ff95ffdf
ggerganov graph : rework inputs to use only unique_ptr, remove attn input abstr…
562a4787
ggerganov context : remove llama_context_i abstraction
d0cb3196
ggerganov context : clean-up
a4fc4e8e
ggerganov graph : clean-up
af9f6b8e
ggerganov llama : remove redundant keywords (struct, enum)
226ff010
ggerganov model : adapt gemma3
5fc6dbd9
ggerganov force pushed from a170669c to 5fc6dbd9 180 days ago
ggerganov graph : restore same attention ops as on master
70ef6530
ggerganov llama : remove TODO + fix indent
31b8eab5
ggerganov merged e0dbec0b into master 179 days ago
ggerganov deleted the gg/llama-kv-cache-v2 branch 179 days ago