llama.cpp
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations
#11595
Merged

remyoudompheng
github-actions added the Vulkan label
github-actions added the devops label
github-actions added the ggml label
remyoudompheng
jeffbolznv
netrunnereve
remyoudompheng
remyoudompheng vulkan: implement specialized MMV kernels for IQ2 quantizations
b80033ef
remyoudompheng vulkan: add MMV kernels for IQ3 quants
e3228c74
remyoudompheng vulkan: Increase MMV batch size and unroll IQ LUT setup
c263f8f3
remyoudompheng vulkan: fix init_iq_shmem for WG sizes larger than tables
8608322f
remyoudompheng force pushed to 8608322f 1 year ago
remyoudompheng marked this pull request as ready for review 1 year ago
remyoudompheng
jeffbolznv
jeffbolznv commented on 2025-02-16
jeffbolznv
netrunnereve
remyoudompheng vulkan: common batch size for all I-quants
cfea4ddb
jeffbolznv
jeffbolznv approved these changes on 2025-02-18
alexjp
jeffbolznv
0cc4m
jeffbolznv
0cc4m
0cc4m
0cc4m approved these changes on 2025-02-28
0cc4m merged 438a8392 into master 1 year ago
