llama.cpp
vulkan: Dynamic subgroup size support for Q6_K mat_vec
#10536

Merged

0cc4m merged 6 commits into ggml-org:master from vulkan

subgroup 64 version with subgroup add. 15% faster

0aa5fd08

github-actions added Vulkan

github-actions added ggml

jeffbolznv commented on 2024-11-27

check for subgroup multiple of 16 and greater than 16

7c313b5f

jeffbolznv commented on 2024-11-28

Merge https://github.com/ggerganov/llama.cpp into vulkan

31a1d8af

subgroup sizes are always a power of 2 (https://github.com/KhronosGro…

97e0c686

force 16 sequential threads per block

2bca8122

jeffbolznv approved these changes on 2024-11-29

netrunnereve marked this pull request as ready for review 1 year ago

0cc4m approved these changes on 2024-11-29

make 16 subgroup size a constant

b65961bf

0cc4m approved these changes on 2024-11-30

0cc4m merged 0533e7fb into master 1 year ago

netrunnereve deleted the vulkan branch 1 year ago

Reviewers

0cc4m

jeffbolznv

Labels

Vulkan ggml
