llama.cpp
Vulkan Improvements
#5835
Merged

0cc4m merged 15 commits into master from 0cc4m/vulkan-improvements
0cc4m
0cc4m Improve dequant shaders, add fast q4_0 dequant
b4172ca2
0cc4m Optimize dmmv non-kquants for GCN
52055970
0cc4m Fix q4_0 dequant dispatch sizes
5169f928
0cc4m Optimize dequant shaders for q4_1, q5_0, q5_1 and q8_0
2a0cf851
0cc4m Add unary and binary op shader templates
5e0a9a2d
0cc4m Fix Vulkan check results
c19c5981
0cc4m Enable non-contiguous support for simple ops
0caf8dc9
0cc4m Add argsort
aa0f428e
0cc4m Merge upstream changes, fix conflicts
93cdea1d
0cc4m Speed up q4_0 dequant code, enable mmq for q4_0
6314096d
0cc4m Rework matmul pipeline selection
c3eba7c1
0cc4m Add soft_max alibi support
2acb2811
0cc4m Add q4_1, q5_0, q5_1 and q8_0 dequant mat mat mul shaders
a8eeab2d
0cc4m Merge upstream changes, fix conflicts
f4ec9a06
0cc4m Add environment variable GGML_VK_FORCE_MAX_ALLOCATION_SIZE to limit m…
a6042049
Artefact2
sorasoras
0cc4m
0cc4m
Nindaleth
Nindaleth
daniandtheweb
0cc4m
ggerganov approved these changes on 2024-03-05
0cc4m merged 61d1c88e into master 2 years ago
0cc4m deleted the 0cc4m/vulkan-improvements branch 2 years ago
akingoverlook
