llama.cpp
vulkan: Use fp16 for the flash attention P*V multiplication
#12783
Merged

jeffbolznv added commit dab1f028: vulkan: Use fp16 for the flash attention P*V multiplication
github-actions added the Vulkan and ggml labels
jeffbolznv requested a review from 0cc4m 154 days ago
0cc4m approved these changes on 2025-04-09
0cc4m merged 7ecd780b into master 151 days ago
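The change the title describes is computing the P*V product inside the flash attention kernel (the softmax weights P multiplied by the value tile V) in fp16 rather than fp32. The actual change lives in the ggml Vulkan shaders; the C++ snippet below is only a minimal sketch of the idea, assuming a compiler and target with `_Float16` support, and all names (`p`, `v`, `o`, `kv_len`, `d_head`) are hypothetical rather than taken from the PR.

```cpp
// Illustrative sketch only, not the actual ggml-vulkan shader code:
// accumulate the flash-attention output row O = P * V in fp16.
#include <cstdio>
#include <vector>

int main() {
    const int kv_len = 4;   // number of K/V rows covered by this tile (hypothetical)
    const int d_head = 8;   // head dimension (hypothetical)

    // p: softmax weights for one query row (already normalized)
    // v: value tile, kv_len x d_head, row-major
    std::vector<float> p = {0.1f, 0.2f, 0.3f, 0.4f};
    std::vector<float> v(kv_len * d_head, 1.0f);

    // fp16 accumulators for the output row
    std::vector<_Float16> o(d_head, (_Float16)0.0f);

    for (int j = 0; j < kv_len; ++j) {
        _Float16 pj = (_Float16)p[j];                 // weight cast to fp16
        for (int k = 0; k < d_head; ++k) {
            // multiply-accumulate performed in fp16, which is the precision
            // choice the PR title refers to
            o[k] += pj * (_Float16)v[j * d_head + k];
        }
    }

    for (int k = 0; k < d_head; ++k)
        printf("%g ", (float)o[k]);
    printf("\n");
    return 0;
}
```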
