llama.cpp
cpu: introduce chunking for flash attention
#16829
Merged

max-krasnyansky: cpu: introduce chunking for flash attention (e2364c9c)
max-krasnyansky requested a review from ggerganov 219 days ago
max-krasnyansky requested a review from slaren 219 days ago
github-actions added the ggml label
slaren approved these changes on 2025-10-30
ggerganov merged dcca0d3a into master 217 days ago
max-krasnyansky deleted the flashattn-chunking branch 217 days ago

