llama.cpp
CUDA: Optimize PAD_REFLECT_1D
#15957
Merged

bugparty
bugparty CUDA: Optimize PAD_REFLECT_1D
1e29fafa
github-actions added testing
github-actions added Nvidia GPU
github-actions added ggml
yosh20004
bugparty
JohannesGaessler
JohannesGaessler commented on 2025-09-14
yosh20004
bugparty use fast_div to improve performance
9494833b
bugparty Apply suggestion from @JohannesGaessler
85835527
bugparty Apply suggestion from @JohannesGaessler
a5ef1d09
JohannesGaessler
JohannesGaessler commented on 2025-09-15
JohannesGaessler
JohannesGaessler commented on 2025-09-15
bugparty
bugparty optimize
b3cf133a
JohannesGaessler
bugparty use a concise expression to further speedup the cuda kernel
d73ba84a
bugparty add comment for rel_i0
e280cb87
bugparty
bugparty
bugparty commented on 2025-09-15
bugparty
bugparty commented on 2025-09-15
bugparty Merge branch 'ggml-org:master' into PAD_REFLECT_1D_expriment
188ce93e
bugparty Merge branch 'ggml-org:master' into PAD_REFLECT_1D_expriment
4286ea78
bugparty Merge branch 'ggml-org:master' into PAD_REFLECT_1D_expriment
dd6789b1
bugparty requested a review from JohannesGaessler 163 days ago
bugparty Merge branch 'ggml-org:master' into PAD_REFLECT_1D_expriment
aa12620c
JohannesGaessler
JohannesGaessler approved these changes on 2025-09-18
JohannesGaessler merged 38dbdf4c into master 162 days ago

