CUDA EP vs ROCM EP hipify audit (#17776)
Migrate most CUDA EP improvements and changes to ROCM EP. The process
involves running hipify against all CUDA EP files (i.e. without
excluding any files in onnxruntime_rocm_hipify.cmake), then comparing
the output with vimdiff against the ROCM EP files that are under source
control and pulling in most changes. The changes are both functional
and formatting-only. Pulling in the formatting changes makes comparing
CUDA EP and ROCM EP easier, though it also makes the PR diff somewhat
harder to read.
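
The comparison step can be narrowed down before opening vimdiff. The
following is a minimal sketch, not the tooling used for this audit: it
assumes the hipified output has been written to one directory and the
tracked ROCM EP files live in another, and it only lists which files
differ or exist on one side. The function name and both directory
arguments are hypothetical.

```python
import filecmp
from pathlib import Path


def audit_hipify_output(generated_dir, tracked_dir):
    """Classify files by comparing hipified output with tracked files.

    Both arguments are hypothetical directory paths: one holding freshly
    hipified CUDA EP sources, one holding the ROCM EP sources that are
    under source control.
    """
    generated_dir, tracked_dir = Path(generated_dir), Path(tracked_dir)

    # Collect relative paths so the two trees can be compared by name.
    generated = {p.relative_to(generated_dir)
                 for p in generated_dir.rglob("*") if p.is_file()}
    tracked = {p.relative_to(tracked_dir)
               for p in tracked_dir.rglob("*") if p.is_file()}

    # Byte-for-byte comparison of files present on both sides.
    changed = sorted(
        rel.as_posix() for rel in generated & tracked
        if not filecmp.cmp(generated_dir / rel, tracked_dir / rel,
                           shallow=False)
    )
    return {
        "changed": changed,  # candidates for a manual vimdiff pass
        "only_generated": sorted(r.as_posix() for r in generated - tracked),
        "only_tracked": sorted(r.as_posix() for r in tracked - generated),
    }
```

Files reported under "changed" are the ones worth opening side by side;
files under "only_generated" hint at ops that could be enabled, and
files under "only_tracked" are ROCM-specific sources with no CUDA EP
counterpart.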
- hipify audit of onnxruntime/core/providers/rocm, enable ops
  - Loop
  - Scan
- hipify audit of onnxruntime/contrib_ops/rocm
  - fix contrib ops search implementation
  - enable more contrib ops
    - Affine
    - ComplexMul
    - ConvTransposeWithDynamicPads
    - Crop
    - DynamicSlice
    - FFT [Rfft, Irfft]
    - GreedySearch
    - ImageScaler
    - ParametricSoftplus
    - ScaledTanh
    - ThresholdRelu
---------
Co-authored-by: cloudhan <cloudhan@outlook.com>