llama.cpp

llama : use n_swa + n_ubatch cells for SWA cache #13833 (Merged)

ggerganov merged 2 commits into master from gg/swa-optimize
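Per the title, this change sizes the SWA (sliding-window attention) KV cache at n_swa + n_ubatch cells rather than the full context: with a sliding window, a token attends only to the previous n_swa positions, and extra room is needed for the n_ubatch tokens of the micro-batch currently being processed. Below is a minimal sketch of that sizing idea; `swa_cache_size` and the example values are hypothetical and not the actual llama.cpp implementation.

```cpp
// Minimal sketch of the SWA cache sizing idea (hypothetical helper, not llama.cpp code):
// the cache holds at most n_swa past cells plus the n_ubatch tokens being processed.
#include <cstdint>
#include <cstdio>

static uint32_t swa_cache_size(uint32_t n_swa, uint32_t n_ubatch) {
    return n_swa + n_ubatch;
}

int main() {
    const uint32_t n_swa    = 4096; // sliding attention window (model-dependent, example value)
    const uint32_t n_ubatch = 512;  // physical micro-batch size (example value)
    printf("SWA cache cells: %u\n", (unsigned) swa_cache_size(n_swa, n_ubatch));
    return 0;
}
```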
github-actions added the examples and server labels
ggerganov changed the title from "llama : use n_swa + n_ubatch cells for SWA cache + auto-batch" to "llama : use n_swa + n_ubatch cells for SWA cache" 103 days ago
ggerganov force-pushed from 1bce7e8d to 6468631d 103 days ago
ggerganov changed the base branch from gg/kv-cache-simplify-part3 to gg/auto-batch 103 days ago
ggerganov force-pushed from 6468631d to ef5bb61a 101 days ago
ggerganov marked this pull request as ready for review 100 days ago
ggerganov requested a review from ngxson 100 days ago
ngxson approved these changes on 2025-05-30
Base automatically changed from gg/auto-batch to master 100 days ago
ggerganov committed "llama : use n_swa + n_ubatch cells for SWA cache" (4a9253a3)
ggerganov force-pushed from ef5bb61a to 4a9253a3 100 days ago
ggerganov committed "llama : add warning about multi-sequence SWA contexts" (855b3974)
ggerganov force-pushed from 83422957 to 855b3974 100 days ago
ggerganov merged 3600cc28 into master 100 days ago
ggerganov deleted the gg/swa-optimize branch 100 days ago
