llama.cpp
ggml-cpu: Use tiled FA for prompt-processing
#19012
Merged

am17an merged 6 commits into ggml-org:master from am17an:tile-fa-cpu
am17an
am17an requested a review from ggerganov 155 days ago
github-actions added the ggml label
am17an force pushed from df10660b to 97afbcbb 155 days ago
am17an force pushed from 97afbcbb to 41a07185 155 days ago
am17an ggml-cpu: Use tiled FA for prompt-processing
2f09b2d3
am17an force pushed from 41a07185 to 2f09b2d3 154 days ago
ggerganov commented on 2026-01-23
am17an fix out of bounds for mask
e30395e5
am17an
ggerganov
am17an skip rows where there are all masks
693935d9
am17an
ggerganov commented on 2026-01-24
am17an skip tile if mask is inf
d898d43a
ggerganov commented on 2026-01-24
ggerganov commented on 2026-01-24
am17an store mask in worksize
dc30629d
am17an force pushed from c1dbc374 to dc30629d 152 days ago
ggerganov approved these changes on 2026-01-25
am17an check inf tile earlier
17f7db50
am17an merged bcb43163 into master 151 days ago
am17an deleted the tile-fa-cpu branch 151 days ago
