llama.cpp
vulkan: small mul_mat_vec optimizations
#10665
Merged

0cc4m merged 44 commits into ggml-org:master from vulkan
netrunnereve dot and delta optimization
b7ad2345
ggerganov server : fix default draft model parameters (#10586)
be2d0048
ggerganov github : minify link [no ci]
ed8649f8
ggerganov github : minify link [no ci] (revert)
ca7c2135
ggerganov metal : small-batch mat-mul kernels (#10581)
d6753d70
pothitos readme : add option, update default value, fix formatting (#10271)
d37b7e09
ngxson llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636)
f697baf8
PABannier metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)
e92a46be
PABannier feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)
2b155906
mahorozte CUDA: remove unnecessary warp reduce in FA (ggml/1032)
0df0452a
ggerganov sync : ggml
69c7f204
ggerganov scripts : remove amx sync
f8fe71ab
ngxson server : (web ui) Various improvements, now use vite as bundler (#10599)
70f0346f
jeffbolznv vulkan: optimize and reenable split_k (#10637)
fa9abd6c
piDack clip : add sycl support (#10574)
0fa9dc4c
mostlygeek Add docs for creating a static build (#10268) (#10630)
0a81a82f
frankier Avoid using __fp16 on ARM with old nvcc (#10616)
9075271c
WrRan fix typo of README.md (#10605)
4153d57c
s-Nick SYCL : Move to compile time oneMKL interface backend selection for NV…
e147054b
netrunnereve remove a multiply
062f256e
netrunnereve requested a review from 0cc4m 281 days ago
github-actions added the Vulkan label
github-actions added the ggml label
netrunnereve merge
fe811349
netrunnereve Merge https://github.com/ggerganov/llama.cpp into vulkan
c403d895
netrunnereve remove a multiply
5fbaf121
netrunnereve additional small optimizations
2f56bac7
netrunnereve Merge https://github.com/ggerganov/llama.cpp into vulkan
591894a0
netrunnereve Merge branch 'ggerganov:master' into vulkan
0b1b7c85
0cc4m commented on 2024-12-06
netrunnereve Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp in…
4eefebc8
netrunnereve Merge branch 'ggerganov:master' into vulkan
32b994e8
netrunnereve Merge branch 'vulkan' of https://github.com/netrunnereve/llama.cpp in…
4b65c6b9
netrunnereve remove ifdefs
f5a15fc6
netrunnereve cleanup
bd17bc45
netrunnereve double the number of rows per workgroup
4a185ad3
netrunnereve Update ggml-vulkan.cpp
984d4707
0cc4m Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgr…
595c1a7d
netrunnereve only increase the number of rows for amd and subgroup size 64
6de28665
netrunnereve merge
bfecabeb
netrunnereve fix missing NUM_ROWS for mul_mat_vec_iq4_nl_f16_f32, untested
1c163674
0cc4m commented on 2024-12-08
netrunnereve Merge branch '0cc4m/vulkan-subgroup-size-control' of https://github.c…
8972f1d3
netrunnereve Merge https://github.com/ggerganov/llama.cpp into vulkan
c7bc42ce
netrunnereve use subgroup min and max to check for gcn (requires https://github.co…
9af9e801
netrunnereve marked this pull request as draft 275 days ago
netrunnereve manual merge ggml-vulkan.cpp
d9c6bf16
netrunnereve fix conflict
8b13f2d0
netrunnereve marked this pull request as ready for review 273 days ago
jeffbolznv approved these changes on 2024-12-12
netrunnereve marked this pull request as draft 273 days ago
netrunnereve set min and max subgroup size in any case
1aa26d78
netrunnereve marked this pull request as ready for review 273 days ago
0cc4m Also double the number of rows for Intel GPUs
20b47d4d
0cc4m merged 64ae0655 into master 273 days ago
netrunnereve deleted the vulkan branch 272 days ago
