PR #18496 ggml-cuda: fixes for concurrent streams

am17an merged 4 commits into ggml-org:master from am17an:graph-opt-fix

github-actions added the Nvidia GPU label

github-actions added the ggml label

am17an force pushed from 901c8243 to 14642162 95 days ago

am17an force pushed from 7e2dcb71 to 1fff4fc6 95 days ago

am17an force pushed from 1fff4fc6 to 918ebb95 95 days ago

am17an force pushed from 918ebb95 to 0bb52944 95 days ago

am17an force pushed from 0bb52944 to b3a9b4ca 95 days ago

ggml-cuda: enable concurrent streams by default

25ae7986

am17an force pushed from b3a9b4ca to 25ae7986 95 days ago

make flag opt-in

93cfa8d1

ggerganov commented on 2026-01-02

add todo about special casing

d405fa1c

am17an requested a review from JohannesGaessler 92 days ago

JohannesGaessler approved these changes on 2026-01-03

am17an changed the title ~~ggml-cuda: enable concurrent streams by default~~ ggml-cuda: fixes for concurrent streams 92 days ago

update comment

b423920f

am17an force pushed from c44291b0 to b423920f 92 days ago

am17an merged e57f5233 into master 92 days ago

am17an deleted the graph-opt-fix branch 91 days ago

Reviewers

JohannesGaessler

ggerganov

Labels

Nvidia GPU, ggml
