PR #10441 cuda : optimize argmax

slaren merged 6 commits into master from sl/cuda-opt-argmax

cuda : optimize argmax

35386e89

github-actions added testing

github-actions added Nvidia GPU

remove unused parameter

0a737d21

fixup : use full warps

1e9447a0

ggerganov approved these changes on 2024-11-21

JohannesGaessler requested a review from

JohannesGaessler 1 year ago

JohannesGaessler commented on 2024-11-21

Apply suggestions from code review

a734da71

fix ub

316f3d31

JohannesGaessler approved these changes on 2024-11-21

ggml : check ne00 <= INT32_MAX in argmax and argsort

48f94d41

github-actions added ggml

slaren merged a5e47592 into master 1 year ago

slaren deleted the sl/cuda-opt-argmax branch 1 year ago

Reviewers

JohannesGaessler

ggerganov

Labels

testing, Nvidia GPU, ggml
