llama.cpp
vulkan: small mul_mat_vec optimizations
#10665
Merged
Commits (44)
dot and delta optimization (netrunnereve, committed 288 days ago)
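
The "dot and delta" idea, roughly: for block-quantized formats the per-block scale (delta) can be factored out of the inner accumulation, so the multiply by the delta happens once per block instead of once per element. Below is a minimal host-side C++ sketch of that transformation, assuming a Q4_0-like block layout; all names are illustrative and this is not the actual shader code, which lives in the Vulkan mul_mat_vec shaders.

```cpp
#include <cstdint>
#include <cstddef>

// Hypothetical Q4_0-like block: 32 4-bit quants packed into 16 bytes plus one scale.
struct block_q4 {
    float   d;       // per-block delta (scale); fp16 in the real format
    uint8_t qs[16];  // 32 packed 4-bit quantized values
};

// Naive version: multiplies by the delta for every element.
float dot_naive(const block_q4 * x, const float * y, size_t nblocks) {
    float acc = 0.0f;
    for (size_t b = 0; b < nblocks; ++b) {
        for (int j = 0; j < 16; ++j) {
            const int q0 = (x[b].qs[j] & 0x0F) - 8;  // low nibble -> elements 0..15
            const int q1 = (x[b].qs[j] >>   4) - 8;  // high nibble -> elements 16..31
            acc += x[b].d * q0 * y[b*32 + j];
            acc += x[b].d * q1 * y[b*32 + j + 16];
        }
    }
    return acc;
}

// Factored version: accumulate the unscaled dot per block, then scale once.
float dot_factored(const block_q4 * x, const float * y, size_t nblocks) {
    float acc = 0.0f;
    for (size_t b = 0; b < nblocks; ++b) {
        float block_acc = 0.0f;
        for (int j = 0; j < 16; ++j) {
            const int q0 = (x[b].qs[j] & 0x0F) - 8;
            const int q1 = (x[b].qs[j] >>   4) - 8;
            block_acc += q0 * y[b*32 + j] + q1 * y[b*32 + j + 16];
        }
        acc += x[b].d * block_acc;  // one multiply per block instead of per element
    }
    return acc;
}
```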
server : fix default draft model parameters (#10586) (netrunnereve, committed 286 days ago)
github : minify link [no ci] (netrunnereve, committed 286 days ago)
github : minify link [no ci] (revert) (netrunnereve, committed 286 days ago)
metal : small-batch mat-mul kernels (#10581) (netrunnereve, committed 286 days ago)
readme : add option, update default value, fix formatting (#10271) (netrunnereve, committed 286 days ago)
llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636) (netrunnereve, committed 286 days ago)
metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026) (netrunnereve, committed 286 days ago)
feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019) (netrunnereve, committed 286 days ago)
CUDA: remove unnecessary warp reduce in FA (ggml/1032) (netrunnereve, committed 286 days ago)
sync : ggml (netrunnereve, committed 286 days ago)
scripts : remove amx sync (netrunnereve, committed 286 days ago)
server : (web ui) Various improvements, now use vite as bundler (#10599) (netrunnereve, committed 286 days ago)
vulkan: optimize and reenable split_k (#10637) (netrunnereve, committed 286 days ago)
clip : add sycl support (#10574) (netrunnereve, committed 286 days ago)
Add docs for creating a static build (#10268) (#10630) (netrunnereve, committed 286 days ago)
Avoid using __fp16 on ARM with old nvcc (#10616) (netrunnereve, committed 286 days ago)
fix typo of README.md (#10605) (netrunnereve, committed 286 days ago)
SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584) (netrunnereve, committed 286 days ago)
remove a multiply (netrunnereve, committed 286 days ago)
merge (netrunnereve, committed 286 days ago)
Merge https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve, committed 286 days ago)
remove a multiply (netrunnereve, committed 286 days ago)
additional small optimizations (netrunnereve, committed 286 days ago)
Merge https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve, committed 286 days ago)
Merge branch 'ggerganov:master' into vulkan (netrunnereve, committed 285 days ago)
Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp into vulkan (netrunnereve, committed 284 days ago)
Merge branch 'ggerganov:master' into vulkan (netrunnereve, committed 284 days ago)
Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp into vulkan (netrunnereve, committed 284 days ago)
remove ifdefs (netrunnereve, committed 283 days ago)
cleanup (netrunnereve, committed 283 days ago)
double the number of rows per workgroup (netrunnereve, committed 283 days ago)
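
"Rows per workgroup" here is how many output rows each compute workgroup of the mul_mat_vec shader produces; doubling it halves the number of dispatched workgroups at the cost of more work per group. A short host-side sketch of the dispatch arithmetic under that assumption, with hypothetical names (the real value is the shader's NUM_ROWS constant set from ggml-vulkan.cpp):

```cpp
#include <cstdint>

// Hypothetical dispatch helper: each workgroup computes `rows_per_wg` rows of
// the output vector, so doubling rows_per_wg halves the number of workgroups.
static uint32_t mul_mat_vec_workgroups(uint32_t nrows, uint32_t rows_per_wg) {
    return (nrows + rows_per_wg - 1) / rows_per_wg;  // ceiling division
}

// Example: a 4096-row weight matrix.
// rows_per_wg = 1 -> 4096 workgroups; rows_per_wg = 2 -> 2048 workgroups.
```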
Update ggml-vulkan.cpp (netrunnereve, committed 283 days ago)
Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats (0cc4m, committed 283 days ago)
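
VK_EXT_subgroup_size_control lets the host pin the subgroup size a compute pipeline runs with and require full subgroups, which matters for cooperative-matrix (coopmat) paths that assume a fixed subgroup width. A minimal sketch of the relevant Vulkan structures; this is illustrative only, not the exact ggml-vulkan.cpp code, and the helper name is made up:

```cpp
#include <vulkan/vulkan.h>

// Sketch: pin the compute subgroup size and require full subgroups
// (VK_EXT_subgroup_size_control). `required_size` would be chosen from the
// device's reported min/max supported subgroup sizes.
static void fill_stage_info(VkPipelineShaderStageCreateInfo & stage,
                            VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT & subgroup_size,
                            VkShaderModule module, uint32_t required_size) {
    subgroup_size = {};
    subgroup_size.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_REQUIRED_SUBGROUP_SIZE_CREATE_INFO_EXT;
    subgroup_size.requiredSubgroupSize = required_size;

    stage = {};
    stage.sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    stage.pNext  = &subgroup_size;  // pin the subgroup size
    stage.flags  = VK_PIPELINE_SHADER_STAGE_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT;  // never run partial subgroups
    stage.stage  = VK_SHADER_STAGE_COMPUTE_BIT;
    stage.module = module;
    stage.pName  = "main";
}
```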
only increase the number of rows for amd and subgroup size 64 (netrunnereve, committed 283 days ago)
merge (netrunnereve, committed 283 days ago)
fix missing NUM_ROWS for mul_mat_vec_iq4_nl_f16_f32, untested (netrunnereve, committed 283 days ago)
Merge branch '0cc4m/vulkan-subgroup-size-control' of https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve, committed 282 days ago)
Merge https://github.com/ggerganov/llama.cpp into vulkan (netrunnereve, committed 282 days ago)
use subgroup min and max to check for gcn (requires https://github.com/ggerganov/llama.cpp/pull/10721) (netrunnereve, committed 282 days ago)
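
The GCN check via subgroup properties follows from the commit message: GCN always runs 64-wide waves, while RDNA parts also support 32-wide ones, so an AMD device whose minimum and maximum supported subgroup sizes are both 64 can be treated as GCN. A hedged sketch against the VK_EXT_subgroup_size_control properties that PR #10721 exposes; the helper name is hypothetical:

```cpp
#include <vulkan/vulkan.h>

// Sketch: query the min/max supported subgroup sizes and treat an AMD device
// that only supports 64-wide subgroups as GCN. Only the Vulkan structs are
// real; everything else is illustrative.
static bool is_amd_gcn(VkPhysicalDevice dev) {
    VkPhysicalDeviceSubgroupSizeControlPropertiesEXT subgroup_size_control = {};
    subgroup_size_control.sType =
        VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_PROPERTIES_EXT;

    VkPhysicalDeviceProperties2 props2 = {};
    props2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
    props2.pNext = &subgroup_size_control;
    vkGetPhysicalDeviceProperties2(dev, &props2);

    const bool is_amd = props2.properties.vendorID == 0x1002;  // AMD PCI vendor ID
    // GCN waves are fixed at 64 lanes; RDNA can also run 32-wide waves.
    return is_amd && subgroup_size_control.minSubgroupSize == 64
                  && subgroup_size_control.maxSubgroupSize == 64;
}
```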
manual merge ggml-vulkan.cpp (netrunnereve, committed 278 days ago)
fix conflict (netrunnereve, committed 278 days ago)
set min and max subgroup size in any case (netrunnereve, committed 278 days ago)
Also double the number of rows for Intel GPUs (0cc4m, committed 278 days ago)