llama.cpp
cuda: refactored ssm_scan and use CUB
#13291
Merged

Your-Cheese
Your-Cheese cuda: refactored ssm_scan to use CUB
b2f8eea9
github-actions added Nvidia GPU
github-actions added ggml
Your-Cheese fixed compilation error when not using CUB
c7d4d45f
Your-Cheese force pushed from 3a454c91 to c7d4d45f 159 days ago
JohannesGaessler commented on 2025-05-06
Your-Cheese assign L to constant and use size_t instead of int
949e4fa2
Your-Cheese
Your-Cheese deduplicated functions
75520d67
Your-Cheese change min blocks per mp to 1
7e559f3e
Your-Cheese Use cub load and store warp transpose
7d259d9e
Your-Cheese Merge https://github.com/ggml-org/llama.cpp into ssm_scan_cub
ae519a48
JohannesGaessler
JohannesGaessler
Your-Cheese
Your-Cheese
IMbackK dismissed these changes on 2025-08-06
Your-Cheese suppress clang warning
dd6ff8e5
JohannesGaessler approved these changes on 2025-08-09
JohannesGaessler dismissed their stale review 62 days ago
Said suppression of warning was sufficient.
JohannesGaessler merged 79c1160b into master 62 days ago
