llama.cpp
cuda : optimize argmax
#10441
Merged

slaren merged 6 commits into master from sl/cuda-opt-argmax
slaren cuda : optimize argmax
35386e89
github-actions added testing
github-actions added Nvidia GPU
slaren remove unused parameter
0a737d21
slaren fixup : use full warps
1e9447a0
ggerganov approved these changes on 2024-11-21
JohannesGaessler requested a review from JohannesGaessler 1 year ago
JohannesGaessler commented on 2024-11-21
slaren Apply suggestions from code review
a734da71
slaren fix ub
316f3d31
JohannesGaessler approved these changes on 2024-11-21
slaren ggml : check ne00 <= INT32_MAX in argmax and argsort
48f94d41
github-actions added ggml
slaren merged a5e47592 into master 1 year ago
slaren deleted the sl/cuda-opt-argmax branch 1 year ago
