llama.cpp
cpu: introduce chunking for flash attention
#16829
Merged
ggerganov merged 1 commit into ggml-org:master from qualcomm:flashattn-chunking
Commit e2364c9c: cpu: introduce chunking for flash attention
max-krasnyansky requested a review from ggerganov 219 days ago
max-krasnyansky requested a review from slaren 219 days ago
github-actions added the ggml label
slaren approved these changes on 2025-10-30
ggerganov merged dcca0d3a into master 217 days ago
max-krasnyansky deleted the flashattn-chunking branch 217 days ago
Reviewers: slaren, ggerganov
Assignees: No one assigned
Labels: ggml
Milestone: No milestone