llama.cpp
Vulkan k-quant mmq and ggml-backend offload functionality
#6155
Merged


0cc4m merged 12 commits into master from 0cc4m/vulkan-improvements
0cc4m: Fix Vulkan no kv offload incoherence (492ad4b0)
0cc4m: Add k-quant mul mat mat shaders (cb6636e0)
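
The mul-mat shaders added here work on matrices kept in their quantized format, dequantizing blocks on the fly inside the kernel instead of materializing a full float copy first. Below is a minimal CPU-side sketch of that inner loop, using a made-up 4-bit block format for illustration only; the real k-quant layouts (block_q4_K and friends) pack scales and mins hierarchically and look quite different:

```cpp
#include <cstdint>

// Hypothetical, simplified 4-bit block: one float scale per 32 weights,
// two 4-bit quants packed per byte. Not the actual k-quant layout.
struct BlockQ4 {
    float   d;       // per-block scale
    uint8_t qs[16];  // 32 x 4-bit quants
};

// Dot product of one quantized block against a float vector slice,
// dequantizing on the fly -- the core loop a mul_mat shader parallelizes.
float block_dot(const BlockQ4& b, const float* x) {
    float acc = 0.0f;
    for (int i = 0; i < 16; ++i) {
        const int q0 = (b.qs[i] & 0x0F) - 8; // low nibble, centered at 0
        const int q1 = (b.qs[i] >>   4) - 8; // high nibble, centered at 0
        acc += x[2*i + 0] * (b.d * q0);
        acc += x[2*i + 1] * (b.d * q1);
    }
    return acc;
}
```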
0cc4m: Merge remote-tracking branch 'origin/master' into 0cc4m/vulkan-improv… (f315402d)
0cc4m: Rework working buffer allocation, reduces VRAM use noticeably (86386e2c)
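
One common way to cut working-buffer VRAM use, and a plausible reading of this commit, is to replace per-operation allocations with a single grow-only scratch buffer that is reused across ops. A hedged host-side sketch; the ScratchBuffer type is invented for illustration, and plain operator new stands in for the real Vulkan device allocation:

```cpp
#include <cstddef>

// Hypothetical grow-only scratch buffer: grow to the largest request seen
// so far, then reuse across operations instead of reallocating per op.
struct ScratchBuffer {
    void*  ptr  = nullptr;
    size_t size = 0;

    void* require(size_t n) {
        if (n > size) {
            release();
            ptr  = ::operator new(n); // stand-in for a device allocation
            size = n;
        }
        return ptr;
    }
    void release() {
        ::operator delete(ptr);
        ptr  = nullptr;
        size = 0;
    }
    ~ScratchBuffer() { release(); }
};
```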
0cc4m requested a review from slaren 2 years ago
slaren approved these changes on 2024-03-19
0cc4m: Default to all dedicated GPUs (bcdd6531)
0cc4m: Add fallback for integrated GPUs if no dedicated GPUs are found (8ddd557d)
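
These two commits together describe a selection policy: use every discrete GPU, and only fall back to an integrated GPU when no discrete one exists. A sketch of that policy against the stock Vulkan API, assuming a valid VkInstance and omitting error handling; the function name is my own:

```cpp
#include <vulkan/vulkan.h>
#include <vector>

// Collect all discrete GPUs; if none exist, fall back to the first
// integrated GPU found.
std::vector<VkPhysicalDevice> pick_devices(VkInstance instance) {
    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, nullptr);
    std::vector<VkPhysicalDevice> all(count);
    vkEnumeratePhysicalDevices(instance, &count, all.data());

    std::vector<VkPhysicalDevice> picked;
    for (VkPhysicalDevice dev : all) {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(dev, &props);
        if (props.deviceType == VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU) {
            picked.push_back(dev);
        }
    }
    if (picked.empty()) { // fallback: integrated GPU
        for (VkPhysicalDevice dev : all) {
            VkPhysicalDeviceProperties props;
            vkGetPhysicalDeviceProperties(dev, &props);
            if (props.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU) {
                picked.push_back(dev);
                break;
            }
        }
    }
    return picked;
}
```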
0cc4m: Add debug info which device is allocating memory (24e5039f)
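
For the allocation debug info, a minimal illustration is a wrapper that logs the owning device's name and the requested size before handing off to vkAllocateMemory. The wrapper name and the cached-properties parameter are invented for this sketch:

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>

// Log which device an allocation lands on, then perform it.
// `props` are the cached properties of the physical device behind `device`.
VkResult alloc_with_log(VkDevice device,
                        const VkPhysicalDeviceProperties& props,
                        const VkMemoryAllocateInfo* info,
                        VkDeviceMemory* mem) {
    fprintf(stderr, "vulkan: allocating %zu bytes on %s\n",
            (size_t) info->allocationSize, props.deviceName);
    return vkAllocateMemory(device, info, nullptr, mem);
}
```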
0cc4m: Fix Intel dequant issue (1fceeb90)
0cc4m: Fix Vulkan GGML_OP_GET_ROWS implementation (d00b11b0)
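
GGML_OP_GET_ROWS is ggml's row-gather op: each output row is a copy of the source row named by an i32 index. A simplified float-only reference of those semantics (ggml's real implementation also handles quantized sources and full 4-D strides):

```cpp
#include <cstdint>
#include <cstddef>

// Reference row gather: dst row r is a copy of src row idx[r].
void get_rows_ref(const float* src, size_t ncols,
                  const int32_t* idx, size_t nrows_out,
                  float* dst) {
    for (size_t r = 0; r < nrows_out; ++r) {
        const float* src_row = src + (size_t) idx[r] * ncols;
        for (size_t c = 0; c < ncols; ++c) {
            dst[r * ncols + c] = src_row[c];
        }
    }
}
```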
0cc4m: Merge upstream changes, fix conflicts (6cb07fb0)
0cc4m: Clean up merge artifacts (0cda5679)
0cc4m: Remove Vulkan warning (b7863ab7)
0cc4m merged ba0c7c70 into master 2 years ago
0cc4m deleted the 0cc4m/vulkan-improvements branch 2 years ago
