[TensorRT] Fix DDS output bug during engine update (#26272)
### Description
Fix a bug in the TRT Execution Provider where the DDS output tensor was
not bound after an engine update.
### Motivation and Context
The `dds_output_allocator_map` is not cleared on engine update, so a previously registered output is mis-recognized as a known DDS (data-dependent shape) output, and the output allocation is never bound on the rebuilt engine's execution context.
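A minimal Python sketch of the buggy control flow (the actual EP code is C++; the function and flag names here are illustrative only, not real onnxruntime APIs): a stale entry surviving the engine rebuild makes the EP take the "known DDS output" branch and skip binding the output address, while clearing the map on update restores the bind.

```python
# Hypothetical illustration of the stale-cache bug; not the actual TRT EP code.
dds_output_allocator_map = {}

def bind_output(output_name, engine_updated, apply_fix):
    # The fix: clear the DDS allocator map whenever the engine is rebuilt.
    if engine_updated and apply_fix:
        dds_output_allocator_map.clear()
    if output_name in dds_output_allocator_map:
        # Treated as a known DDS output: no output address is bound on the
        # new execution context, which later fails at enqueueV3.
        return "skip binding (reuse existing allocator)"
    dds_output_allocator_map[output_name] = object()  # register allocator
    return "bind output allocation"

# Without the fix, the run after an engine update hits the stale entry:
print(bind_output("output", engine_updated=False, apply_fix=False))
print(bind_output("output", engine_updated=True, apply_fix=False))
```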
Script to reproduce the issue:
```python
# create an onnx model with:
# inputs: data -> NonZero(data) -> Transpose -> GatherND -> output
# then run the model with onnxruntime
def create_model():
    import onnx
    from onnx import helper, TensorProto

    input = helper.make_tensor_value_info("data", TensorProto.FLOAT, ["d1", "d2"])
    output = helper.make_tensor_value_info("output", TensorProto.FLOAT, ["nzr"])
    nonzeros_node = helper.make_node("NonZero", ["data"], ["nonzeros"], "nonzeros_node")
    transpose_node = helper.make_node(
        "Transpose", ["nonzeros"], ["nonzeros_t"], "transpose_node"
    )
    gathernd_node = helper.make_node(
        "GatherND", ["data", "nonzeros_t"], ["output"], "gathernd_node"
    )
    value_info = [
        helper.make_tensor_value_info("nonzeros", TensorProto.INT64, [2, "nzr"]),
        helper.make_tensor_value_info("nonzeros_t", TensorProto.INT64, ["nzr", 2]),
    ]
    graph = helper.make_graph(
        [nonzeros_node, transpose_node, gathernd_node],
        "test_graph",
        [input],
        [output],
        value_info=value_info,
    )
    model = helper.make_model(graph)
    onnx.save(model, "model_dds.onnx")


def run_model():
    import onnxruntime as ort
    import numpy as np

    sess = ort.InferenceSession(
        "model_dds.onnx",
        providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    print("Running with data shape (3,4)")
    data = np.random.randn(3, 4).astype(np.float32)
    sess.run(None, {"data": data})
    print("Running with data shape (5,6)")
    data = np.random.randn(5, 6).astype(np.float32)
    sess.run(None, {"data": data})


create_model()
run_model()
```
Before the change:
> IExecutionContext::enqueueV3: Error Code 3: API Usage Error (Parameter check failed, condition: mContext.profileObliviousBindings.at(profileObliviousIndex) || getPtrOrNull(mOutputAllocators, profileObliviousIndex). Neither address or allocator is set for output tensor scores. Call setOutputTensorAddress, setTensorAddress or setOutputAllocator before enqueue/execute.) ... Status Message: TensorRT EP execution context enqueue failed.