llama.cpp
98197e5c - vulkan: optimizations for deepseek prompt processing (#14555)

Commit
63 days ago
vulkan: optimizations for deepseek prompt processing (#14555) * vulkan: allow unclamped loads in coopmat2 mul_mat_id shader * vulkan: increase coopmat2 mul_mat_id tile size * vulkan: optimize mat_mul_id row_ids search to batch loads, and port to coopmat1 path * vulkan: use smaller FA row size when head size is large. applies to both scalar and CM2 paths (CM1 isn't used due to shared memory limits)
Author
Parents
Loading