onnxruntime
6cc57721 - Change CUDA implementation of Transpose to support all fixed size tensor types (#2387)

Commit

6 years ago

Change CUDA implementation of Transpose to support all fixed size tensor types (#2387)

* Change CUDA implementation of Transpose to not use a typed kernel so we can support more types with minimum binary size. Add support for 8, 16, 32 and 64 bit types. Add unit tests. Add method so the implementation can be called directly (will be used by CUDA Scan very soon).
* Disable TensorRT for MLFloat16 and int8 unit tests.
* Address PR comment and add support for calling cublas implementation if type is mlfloat16.
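The idea of an untyped transpose can be illustrated with a minimal host-side C++ sketch. This is not the onnxruntime CUDA kernel: the names `TransposeBytes` and `TransposeUntyped` are hypothetical, and the code runs on the CPU purely to show the dispatch. The assumption is that a transpose only moves elements and never interprets them, so one instantiation per element width (1, 2, 4 or 8 bytes) can serve every fixed size type of that width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper: permute the axes of a row-major tensor whose elements
// are opaque blobs of N bytes. Only the element width matters, not its type.
template <std::size_t N>
void TransposeBytes(const std::vector<std::size_t>& in_dims,
                    const std::vector<std::size_t>& perm,
                    const void* input, void* output) {
  assert(perm.size() == in_dims.size());
  const std::size_t rank = in_dims.size();

  // Row-major strides of the input, measured in elements.
  std::vector<std::size_t> in_strides(rank, 1);
  for (std::size_t i = 1; i < rank; ++i) {
    const std::size_t d = rank - 1 - i;
    in_strides[d] = in_strides[d + 1] * in_dims[d + 1];
  }

  std::size_t total = 1;
  for (std::size_t d : in_dims) total *= d;

  const unsigned char* src = static_cast<const unsigned char*>(input);
  unsigned char* dst = static_cast<unsigned char*>(output);

  // Walk the output in row-major order; output axis i is input axis perm[i].
  for (std::size_t out_idx = 0; out_idx < total; ++out_idx) {
    std::size_t rem = out_idx;
    std::size_t in_idx = 0;
    for (std::size_t i = rank; i-- > 0;) {
      const std::size_t dim = in_dims[perm[i]];
      in_idx += (rem % dim) * in_strides[perm[i]];
      rem /= dim;
    }
    // Copy N raw bytes; the element is never interpreted as a number.
    std::memcpy(dst + out_idx * N, src + in_idx * N, N);
  }
}

// Hypothetical entry point: select the instantiation from the element size
// in bytes instead of from the tensor's element type.
void TransposeUntyped(std::size_t element_size,
                      const std::vector<std::size_t>& in_dims,
                      const std::vector<std::size_t>& perm,
                      const void* input, void* output) {
  switch (element_size) {
    case 1: TransposeBytes<1>(in_dims, perm, input, output); break;
    case 2: TransposeBytes<2>(in_dims, perm, input, output); break;
    case 4: TransposeBytes<4>(in_dims, perm, input, output); break;
    case 8: TransposeBytes<8>(in_dims, perm, input, output); break;
    default: throw std::invalid_argument("unsupported element size");
  }
}
```

With this shape, a caller holding a `float` tensor and a caller holding an `int32_t` tensor would both pass `element_size` 4 and share the same compiled code, which is the binary size saving the commit message describes. A call such as `TransposeUntyped(sizeof(float), {2, 3}, {1, 0}, in, out)` swaps the two axes of a 2x3 matrix.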

References

#2387 - Change CUDA implementation of Transpose to support all fixed size tensor types

Author

skottmckay

Parents

109b3cb4
