onnxruntime
Optimize CUDA Kernel for 3D and 4D Transpose
#8928
Merged

SherlockNoMad merged 15 commits into master from bahuang/optimize_transpose
SherlockNoMad Optimize Transpose120
9b70a80c
SherlockNoMad Optimize Transpose102
a7c3fc0f
SherlockNoMad added training
SherlockNoMad added core runtime
SherlockNoMad requested a review from weixingzhang 4 years ago
SherlockNoMad requested a review 4 years ago
weixingzhang dismissed these changes on 2021-09-01
SherlockNoMad Generalize Transpose0123 for more input shapes
f73fece9
SherlockNoMad dismissed their stale review via f73fece9 4 years ago
SherlockNoMad requested a review from baijumeswani 4 years ago
SherlockNoMad requested a review from BowenBao 4 years ago
SherlockNoMad requested a review from liqunfu 4 years ago
SherlockNoMad requested a review from thiagocrepaldi 4 years ago
SherlockNoMad requested a review from tlh20 4 years ago
SherlockNoMad Fix build error
dfec92fb
SherlockNoMad Add debug log
c04302ca
SherlockNoMad Fix failing cases
5efdbda9
weixingzhang commented on 2021-09-09
SherlockNoMad Add logging
0fd67d19
SherlockNoMad Relax check to run more cases
7f1ee3ac
SherlockNoMad Fix bug
42cf1a73
SherlockNoMad commented on 2021-09-10
SherlockNoMad All test passing
c5202510
SherlockNoMad Add Transpose3D test cases
3299087c
SherlockNoMad adjust order
f1289ca3
SherlockNoMad clean up
b7c20c40
SherlockNoMad force pushed from 367a0319 to b7c20c40 4 years ago
SherlockNoMad Fix build
a5dc0d49
SherlockNoMad update rocm kernel
3322210e
weixingzhang approved these changes on 2021-09-13
ytaous approved these changes on 2021-09-13
SherlockNoMad changed the title from "Optimize Transpose102 and Transpose120 for CUDA" to "Optimize CUDA Kernel for 3D and 4D Transpose" 4 years ago
baijumeswani approved these changes on 2021-09-14
SherlockNoMad merged 9174cbe3 into master 4 years ago
SherlockNoMad deleted the bahuang/optimize_transpose branch 4 years ago
