Vulkan Improvements #5835
Improve dequant shaders, add fast q4_0 dequant
b4172ca2
Optimize dmmv non-kquants for GCN
52055970
Fix q4_0 dequant dispatch sizes
5169f928
Optimize dequant shaders for q4_1, q5_0, q5_1 and q8_0
2a0cf851
Add unary and binary op shader templates
5e0a9a2d
Fix Vulkan check results
c19c5981
Enable non-contiguous support for simple ops
0caf8dc9
Add argsort
aa0f428e
Merge upstream changes, fix conflicts
93cdea1d
Speed up q4_0 dequant code, enable mmq for q4_0
6314096d
Rework matmul pipeline selection
c3eba7c1
Add soft_max alibi support
2acb2811
Add q4_1, q5_0, q5_1 and q8_0 dequant mat mat mul shaders
a8eeab2d
Merge upstream changes, fix conflicts
f4ec9a06
Add environment variable GGML_VK_FORCE_MAX_ALLOCATION_SIZE to limit m…
a6042049
ggerganov approved these changes on 2024-03-05
0cc4m merged 61d1c88e into master 2 years ago
0cc4m deleted the 0cc4m/vulkan-improvements branch 2 years ago