TRT RTX EP changes (#25015)
### Description
- Don't use the CUDA runtime API to set the device when a stream is already
provided.
- Expose an option to limit the maximum shared memory TensorRT can use.
- Fix compilation issues with deprecated APIs.
- Small test fix.
### Motivation and Context
---------
Co-authored-by: Ankan Banerjee <anbanerjee@nvidia.com>