llama.cpp
CUDA: add dynamic shared mem to softmax, refactor general usage
#14497
Merged


am17an requested a review from JohannesGaessler 128 days ago
JohannesGaessler commented on 2025-07-02
github-actions added labels: testing, Nvidia GPU, ggml
am17an force pushed from 6429086b to 4c7bcaab 128 days ago
am17an force pushed from 4c7bcaab to a67ef5c0 128 days ago
am17an requested a review from JohannesGaessler 128 days ago
am17an - CUDA: add dynamic shared mem to softmax, refactor general usage (b9bcb7d7)
am17an - Review: refactor switch statement, change cross_entropy to use full size (34e5142d)
am17an - rebase (7b162818)
am17an force pushed from a67ef5c0 to 7b162818 127 days ago
JohannesGaessler approved these changes on 2025-07-02
CISC
am17an merged 55c2646b into master 127 days ago
am17an deleted the cuda_increase_shared_mem_limits branch 127 days ago
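
For context on the PR title: CUDA kernels that use an `extern __shared__` array get their shared-memory size at launch time ("dynamic" shared memory), and requests above the default 48 KiB per block must be opted into via `cudaFuncSetAttribute`. The sketch below is a minimal, hypothetical illustration of that pattern for a row softmax (it is not the actual ggml/llama.cpp kernel; the kernel name, launch geometry, and helper are invented for illustration):

```cuda
#include <math.h>

// Hypothetical single-row softmax run by one warp (32 threads).
// The row is cached in dynamic shared memory: ncols * sizeof(float)
// bytes, passed as the third <<<>>> launch parameter, which may
// exceed the 48 KiB static default on recent GPUs.
__global__ void softmax_row(const float * x, float * dst, int ncols) {
    extern __shared__ float vals[];   // size chosen at launch time
    const int tid = threadIdx.x;

    // load the row and find its max (for numerical stability)
    float max_val = -INFINITY;
    for (int i = tid; i < ncols; i += 32) {
        vals[i] = x[i];
        max_val = fmaxf(max_val, vals[i]);
    }
    for (int mask = 16; mask > 0; mask >>= 1) {
        max_val = fmaxf(max_val, __shfl_xor_sync(0xffffffff, max_val, mask));
    }

    // exponentiate in shared memory and sum
    float sum = 0.0f;
    for (int i = tid; i < ncols; i += 32) {
        vals[i] = expf(vals[i] - max_val);
        sum += vals[i];
    }
    for (int mask = 16; mask > 0; mask >>= 1) {
        sum += __shfl_xor_sync(0xffffffff, sum, mask);
    }

    // normalize
    for (int i = tid; i < ncols; i += 32) {
        dst[i] = vals[i] / sum;
    }
}

// Host side: shared-memory requests above the 48 KiB default must be
// opted into explicitly, otherwise the kernel launch fails.
void launch_softmax(const float * x, float * dst, int ncols, cudaStream_t stream) {
    const size_t nbytes = (size_t) ncols * sizeof(float);
    if (nbytes > 48 * 1024) {
        cudaFuncSetAttribute(softmax_row,
            cudaFuncAttributeMaxDynamicSharedMemorySize, (int) nbytes);
    }
    softmax_row<<<1, 32, nbytes, stream>>>(x, dst, ncols);
}
```

Caching the whole row in shared memory avoids re-reading it from global memory between the max, sum, and normalize passes, which is presumably why the softmax kernel benefits from raising the dynamic shared-memory limit for long rows.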
