onnxruntime
22543919 - Optimize Concat and Split on CUDA to eliminate host-to-device copies when sizes are all the same (#8833)

Commit

4 years ago

Optimize Concat and Split on CUDA to eliminate host-to-device copies when sizes are all the same (#8833)

* special case concat and split when sizes are equal
* add tests for 16 and 32 inputs with same dim
* add tests for 16/64 inputs on concat or 16/64 outputs on split
* try eliminate windows warning
* outter => outer
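As a rough illustration of why equal sizes can remove the need for a copied lookup table, the sketch below is a CPU-side reference in C++ and is not the code from this commit. It maps a position along the concat axis to an input index and an offset within that input. The names `Location`, `LocateGeneral`, and `LocateSameSize` are hypothetical. The assumption is that the general case needs a per-input offset table, which a GPU kernel would have to receive from the host, while the equal-size case needs only one scalar.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// (input index, offset within that input) for a position along the concat axis.
// Hypothetical type for illustration only.
using Location = std::pair<std::size_t, std::size_t>;

// General case: inputs may differ in size along the axis, so a table of
// starting offsets (one entry per input) is needed to find the owning input.
// On a GPU, a table like this would have to be made available to the kernel.
Location LocateGeneral(std::size_t axis_pos,
                       const std::vector<std::size_t>& starts) {
  std::size_t input = 0;
  // Advance while the next input begins at or before axis_pos.
  while (input + 1 < starts.size() && starts[input + 1] <= axis_pos) {
    ++input;
  }
  return {input, axis_pos - starts[input]};
}

// Equal-size case: every input has the same extent along the axis, so a
// single scalar replaces the table and the mapping is plain arithmetic.
Location LocateSameSize(std::size_t axis_pos, std::size_t size_per_input) {
  return {axis_pos / size_per_input, axis_pos % size_per_input};
}
```

For four inputs of extent 4 along the axis, `LocateSameSize(pos, 4)` agrees with `LocateGeneral(pos, {0, 4, 8, 12})` for every `pos` from 0 to 15. The same arithmetic applies in reverse to Split, where each output takes an equal slice of the input.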

Author

Suffian Khan

Parents

85898929
