[TensorRT EP] Add new provider option to exclude ops from running on TRT (#23705)
This PR removes the implicit filtering that kept DDS (data-dependent shape) ops from running on TRT.
In other words, DDS nodes now run on TRT by default when TRT supports them.
It also adds a new provider option, `trt_op_types_to_exclude`:
- Users can provide a list of op types to exclude from running on TRT
- e.g. `trt_op_types_to_exclude="NonMaxSuppression,NonZero,RoiAlign"`
(This PR essentially adds back a
[feature](https://github.com/microsoft/onnxruntime/pull/22681) whose
merge was previously put on hold.)
[Note]
TRT 10 may have performance issues when running models
that contain DDS operations such as NonMaxSuppression, NonZero, and
RoiAlign (e.g., Faster-RCNN).
If users encounter significant performance degradation, we suggest
excluding those DDS ops from running on TRT, i.e.
`trt_op_types_to_exclude="NonMaxSuppression,NonZero,RoiAlign"`. Those
DDS nodes will then be run by the CUDA EP or on CPU.
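
As a rough illustration of the suggestion above, the option can be passed through the Python API's per-provider options dictionary. This is a minimal sketch, not part of the PR itself: the helper names `build_providers` and `create_session` and the model path `faster_rcnn.onnx` are hypothetical, and it assumes an onnxruntime build that includes the TensorRT and CUDA EPs.

```python
def build_providers(excluded_op_types):
    """Build a providers list that keeps the given op types off TRT.

    Hypothetical helper for illustration; not part of onnxruntime.
    """
    trt_options = {
        # The option takes a single comma-separated string of op type names.
        "trt_op_types_to_exclude": ",".join(excluded_op_types),
    }
    return [
        ("TensorrtExecutionProvider", trt_options),
        # Listed as fallbacks for nodes that TRT does not take.
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ]


def create_session(model_path, excluded_op_types):
    """Create an InferenceSession with the excluded op types applied."""
    # Imported lazily so build_providers can be used without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path, providers=build_providers(excluded_op_types)
    )


providers = build_providers(["NonMaxSuppression", "NonZero", "RoiAlign"])
print(providers[0][1]["trt_op_types_to_exclude"])
```

A session would then be created with something like `create_session("faster_rcnn.onnx", ["NonMaxSuppression", "NonZero", "RoiAlign"])`, where the model path is a placeholder.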