PR #13291 cuda: refactored ssm_scan and use CUB

JohannesGaessler merged 8 commits into ggml-org:master from Your-Cheese:ssm_scan_cub

cuda: refactored ssm_scan to use CUB

b2f8eea9

github-actions added Nvidia GPU

github-actions added ggml

fixed compilation error when not using CUB

c7d4d45f

Your-Cheese force pushed to c7d4d45f 1 year ago

JohannesGaessler commented on 2025-05-06

assign L to constant and use size_t instead of int

949e4fa2

deduplicated functions

75520d67

change min blocks per mp to 1

7e559f3e

Use cub load and store warp transpose

7d259d9e

Merge https://github.com/ggml-org/llama.cpp into ssm_scan_cub

ae519a48

IMbackK dismissed these changes on 2025-08-06

suppress clang warning

dd6ff8e5

JohannesGaessler approved these changes on 2025-08-09

JohannesGaessler dismissed their stale review 351 days ago

Said suppression of warning was sufficient.

JohannesGaessler merged 79c1160b into master 351 days ago

Reviewers

JohannesGaessler

IMbackK

Labels

Nvidia GPU, ggml