llama.cpp
metal: speed up Qwen3-VL image encoding on large images by ~11%
#21443

Open

Avidanborisov wants to merge 3 commits into ggml-org:master from Avidanborisov:metal-img-encode-optim

metal: make flash attention support 16 queries per threadgroup

c0788eb3

metal: use 16 queries per threadgroup and 8 simdgroups in flash atten…

23bd7621

Avidanborisov requested a review 4 days ago

github-actions added the ggml label

github-actions added the Apple Metal label

Merge branch 'ggml-org:master' into metal-img-encode-optim

7ef7cb05
