vulkan: small mul_mat_vec optimizations #10665
dot and delta optimization
b7ad2345
server : fix default draft model parameters (#10586)
be2d0048
github : minify link [no ci]
ed8649f8
github : minify link [no ci] (revert)
ca7c2135
metal : small-batch mat-mul kernels (#10581)
d6753d70
readme : add option, update default value, fix formatting (#10271)
d37b7e09
llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636)
f697baf8
metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)
e92a46be
feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)
2b155906
CUDA: remove unnecessary warp reduce in FA (ggml/1032)
0df0452a
sync : ggml
69c7f204
scripts : remove amx sync
f8fe71ab
server : (web ui) Various improvements, now use vite as bundler (#10599)
70f0346f
vulkan: optimize and reenable split_k (#10637)
fa9abd6c
clip : add sycl support (#10574)
0fa9dc4c
Add docs for creating a static build (#10268) (#10630)
0a81a82f
Avoid using __fp16 on ARM with old nvcc (#10616)
9075271c
fix typo of README.md (#10605)
4153d57c
SYCL : Move to compile time oneMKL interface backend selection for NV…
e147054b
remove a multiply
062f256e
merge
fe811349
Merge https://github.com/ggerganov/llama.cpp into vulkan
c403d895
remove a multiply
5fbaf121
additional small optimizations
2f56bac7
Merge https://github.com/ggerganov/llama.cpp into vulkan
591894a0
Merge branch 'ggerganov:master' into vulkan
0b1b7c85
0cc4m commented on 2024-12-06
Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp in…
4eefebc8
Merge branch 'ggerganov:master' into vulkan
32b994e8
Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp in…
4b65c6b9
remove ifdefs
f5a15fc6
cleanup
bd17bc45
double the number of rows per workgroup
4a185ad3
Update ggml-vulkan.cpp
984d4707
Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgr…
595c1a7d
only increase the number of rows for amd and subgroup size 64
6de28665
merge
bfecabeb
fix missing NUM_ROWS for mul_mat_vec_iq4_nl_f16_f32, untested
1c163674
0cc4m commented on 2024-12-08
Merge branch '0cc4m/vulkan-subgroup-size-control' of https://github.c…
8972f1d3
Merge https://github.com/ggerganov/llama.cpp into vulkan
c7bc42ce
use subgroup min and max to check for gcn (requires https://github.co…
9af9e801
manual merge ggml-vulkan.cpp
d9c6bf16
fix conflict
8b13f2d0
netrunnereve marked this pull request as ready for review 273 days ago
set min and max subgroup size in any case
1aa26d78
Also double the number of rows for Intel GPUs
20b47d4d
0cc4m merged commit 64ae0655 into master 273 days ago