onnxruntime
9174cbe3
- Optimize CUDA Kernel for 3D and 4D Transpose (#8928)
Commit
4 years ago
Optimize CUDA Kernel for 3D and 4D Transpose (#8928)

* Optimize Transpose120 and Transpose102
* Generalize Transpose0123 for more input shapes
* Add Transpose3D test cases
* update rocm kernel
References
#8928 - Optimize CUDA Kernel for 3D and 4D Transpose
Author
SherlockNoMad
Parents
5969d576