Enable global TRT timing cache (#17865)
I am adding a new `trt_timing_cache_path` option. Internally it is
handled as `global_cache_path_` and will be set via a fall through
approach:
1. no path provided => workdir
2. `trt_engine_cache_path` provided but no `trt_timing_cache_path` =>
`trt_engine_cache_path`
3. `trt_timing_cache_path` provided => `trt_timing_cache_path` (if not
provided `trt_engine_cache_path` will still be workdir)
### Motivation and Context
A TRT timing cache can be reused across multiple models as it only holds
kernel timings and it is common that network "patterns" are reused. This
can accelerate build times a lot.
---------
Co-authored-by: Carson M <carson@pyke.io>