onnxruntime
Optimize CUDA Sum op kernel and refactor CUDA elementwise variadic input op kernels
#4418
Merged

edgchen1 merged 8 commits into master from edgchen1/sum_optimization
edgchen1 Initial implementation.
e4980782
edgchen1 Fixes, clean up, test.
dd4fe4f3
edgchen1 Clean up kernel for loops, use binary impl for 2 input and no broadca…
5a0731ca
edgchen1 added training
edgchen1 added core runtime
edgchen1 requested a review from wschin 5 years ago
edgchen1 requested a review from SherlockNoMad 5 years ago
edgchen1 requested a review from ashbhandare 5 years ago
edgchen1 requested a review from weixingzhang 5 years ago
edgchen1 requested a review from HectorSVC 5 years ago
edgchen1 requested a review 5 years ago
edgchen1 commented on 2020-07-03
edgchen1 commented on 2020-07-03
edgchen1 Address comments.
0f57ae74
edgchen1 Fix warning.
5bab69ad
edgchen1 Optimize Sum kernel. Use local variable to store intermediate output …
04f27de6
HectorSVC requested a review from ke1337 5 years ago
HectorSVC dismissed these changes on 2020-07-09
weixingzhang commented on 2020-07-09
wschin commented on 2020-07-09
wschin dismissed these changes on 2020-07-09
edgchen1 Address PR comment.
24aa8dea
edgchen1 dismissed their stale review via 24aa8dea 5 years ago
edgchen1 dismissed their stale review via 24aa8dea 5 years ago
weixingzhang commented on 2020-07-09
edgchen1 Address PR comments.
cd70f833
weixingzhang approved these changes on 2020-07-10
edgchen1 merged 6c7da5e9 into master 5 years ago
edgchen1 deleted the edgchen1/sum_optimization branch 5 years ago
