llama.cpp

llama : use n_swa + n_ubatch cells for SWA cache #13833 (Merged)

ggerganov merged 2 commits into master from gg/swa-optimize
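Per the title, this change sizes the SWA (sliding-window attention) KV cache at n_swa + n_ubatch cells rather than the full context: with a sliding window, a token attends only to the previous n_swa positions, and extra room is needed for the n_ubatch tokens of the micro-batch currently being processed. Below is a minimal sketch of that sizing idea; `swa_cache_size` and the example values are hypothetical and not the actual llama.cpp implementation.

```cpp
// Minimal sketch of the SWA cache sizing idea (hypothetical helper, not llama.cpp code):
// the cache holds at most n_swa past cells plus the n_ubatch tokens being processed.
#include <cstdint>
#include <cstdio>

static uint32_t swa_cache_size(uint32_t n_swa, uint32_t n_ubatch) {
    return n_swa + n_ubatch;
}

int main() {
    const uint32_t n_swa    = 4096; // sliding attention window (model-dependent, example value)
    const uint32_t n_ubatch = 512;  // physical micro-batch size (example value)
    printf("SWA cache cells: %u\n", (unsigned) swa_cache_size(n_swa, n_ubatch));
    return 0;
}
```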
github-actions added the examples and server labels
ggerganov changed the title from "llama : use n_swa + n_ubatch cells for SWA cache + auto-batch" to "llama : use n_swa + n_ubatch cells for SWA cache" 103 days ago
ggerganov force-pushed from 1bce7e8d to 6468631d 103 days ago
ggerganov changed the base branch from gg/kv-cache-simplify-part3 to gg/auto-batch 103 days ago
ggerganov force-pushed from 6468631d to ef5bb61a 101 days ago
ggerganov marked this pull request as ready for review 100 days ago
ggerganov requested a review from ngxson 100 days ago
ngxson approved these changes on 2025-05-30
Base automatically changed from gg/auto-batch to master 100 days ago
ggerganov committed "llama : use n_swa + n_ubatch cells for SWA cache" (4a9253a3)
ggerganov force-pushed from ef5bb61a to 4a9253a3 100 days ago
ggerganov committed "llama : add warning about multi-sequence SWA contexts" (855b3974)
ggerganov force-pushed from 83422957 to 855b3974 100 days ago
ggerganov merged 3600cc28 into master 100 days ago
ggerganov deleted the gg/swa-optimize branch 100 days ago
