onnxruntime
d9730c7f - [TensorRT EP] Fix bug for DDS output handling for empty tensor (#19575)

Commit

1 year ago

[TensorRT EP] Fix bug for DDS output handling for empty tensor (#19575)

When a DDS output is an empty tensor (i.e. any of its dimensions is 0), the TRT EP performs neither cudaMemcpyAsync() nor cuda::Impl_Cast(), so that it cannot accidentally overwrite a location that might belong to other tensors. This PR also refactors the code to allocate only a single byte for any empty tensor.

#TODO: add unit tests to cover the DDS code paths, or do more testing with concurrent, sequential, and threaded faster-rcnn using onnx_test_runner and verify the outputs.

---------

Co-authored-by: Chi Lo <lochi@microsoft.com>
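The guard described in the commit message can be sketched roughly as follows. This is a minimal host-only illustration, not the code from the PR: the helper names (IsEmptyShape, ElementCount, AllocationSizeBytes, CopyDdsOutput) are hypothetical, std::memcpy stands in for cudaMemcpyAsync() and cuda::Impl_Cast(), and the one-byte placeholder is assumed here to be per tensor.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: a tensor is empty when any of its dimensions is 0.
bool IsEmptyShape(const std::vector<int64_t>& dims) {
  for (int64_t d : dims) {
    if (d == 0) return true;
  }
  return false;
}

// Hypothetical helper: product of all dimensions (1 for a scalar shape).
std::size_t ElementCount(const std::vector<int64_t>& dims) {
  std::size_t n = 1;
  for (int64_t d : dims) n *= static_cast<std::size_t>(d);
  return n;
}

// Bytes to allocate for an output buffer. Assumption: an empty tensor
// gets a one-byte placeholder instead of a zero-byte allocation.
std::size_t AllocationSizeBytes(const std::vector<int64_t>& dims,
                                std::size_t elem_size) {
  if (IsEmptyShape(dims)) return 1;
  return ElementCount(dims) * elem_size;
}

// Stand-in for the copy/cast step. Returns true if data was copied.
// For an empty tensor nothing is written, so dst is left untouched.
bool CopyDdsOutput(void* dst, const void* src,
                   const std::vector<int64_t>& dims, std::size_t elem_size) {
  if (IsEmptyShape(dims)) return false;  // skip: nothing valid to copy
  std::memcpy(dst, src, ElementCount(dims) * elem_size);
  return true;
}
```

With a shape such as {0, 4}, AllocationSizeBytes returns 1 and CopyDdsOutput leaves the destination untouched, which mirrors the intent of skipping the copy so that neighbouring buffers are not overwritten.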

References

#19575 - [TensorRT EP] Fix bug for DDS output handling for empty tensor

Author

chilo-ms

Parents

1e78bcea
