llama.cpp
vulkan: Slang flash attention shader
#20451
Open

0cc4m wants to merge 6 commits into master from 0cc4m/vulkan-slang-flash-attention
0cc4m
github-actions added Vulkan
github-actions added ggml
jeffbolznv
jeffbolznv
0cc4m vulkan: port Flash Attention shader to Slang
a4ac1d90
0cc4m fix slang issues
e1b40fa5
0cc4m generic reductions
2c623bfa
0cc4m move kv shmem staging to function
0349025d
0cc4m Revert "move kv shmem staging to function"
e880cb2e
0cc4m unify scalar+vector and fix reduce function
5ec6569e
0cc4m force pushed from f43252d5 to 5ec6569e 91 days ago
0cc4m
csyonghe

Reviewers
No reviews
Assignees
No one assigned