Fill RNN CUDA operator opset gap (14 → 22) (#27743)
### Description
Extends RNN CUDA kernel registration from opset 14 to opset 22,
following the standard opset gap-filling pattern:
- **`rnn.cc`**: Cap the existing non-versioned opset 14 kernel to a
versioned 14–21 range; add a new non-versioned kernel at opset 22
- **`cuda_execution_provider.cc`**: Update forward declarations and
`BuildKernelCreateInfo` entries to match (versioned 14–21 +
non-versioned 22); remove duplicate GRU opset 22 entries introduced
during merge
- **`OperatorKernels.md`**: Update CUDA RNN entry to reflect three
tiers: `[7,13]`, `[14,21]`, `22+`
No behavioral changes: the operator implementation is identical across
opsets 14–22. This is a registration-only change.
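
The effect of the three tiers can be illustrated with a small standalone
sketch. It does not use ONNX Runtime headers; `KernelEntry`, `Matches`,
`Resolve`, `Before`, `After`, and `CheckTiers` are hypothetical names, and
the matching rule is an assumption modeled on how the kernel registry
appears to treat non-versioned kernels (an open-ended kernel matches only
when its start version equals the schema's since-version). The "before"
layout of `[7,13]` plus `14+` is likewise inferred from the description
above.

```cpp
#include <cassert>
#include <climits>
#include <string>
#include <vector>

// Hypothetical stand-in for one kernel registration (not an ORT type).
struct KernelEntry {
  int start;         // first opset the kernel is registered for
  int end;           // last opset, or INT_MAX for a non-versioned kernel
  std::string name;  // label used only in this sketch
};

// Assumed matching rule: an exact start match always succeeds; otherwise
// only a versioned (closed-range) kernel may cover a later since-version.
bool Matches(const KernelEntry& k, int since_version) {
  if (k.start == since_version) return true;
  return k.start < since_version && k.end != INT_MAX &&
         since_version <= k.end;
}

// Returns the label of the first matching kernel, or "CPU fallback"
// when no CUDA registration in the list matches.
std::string Resolve(const std::vector<KernelEntry>& kernels,
                    int since_version) {
  for (const auto& k : kernels) {
    if (Matches(k, since_version)) return k.name;
  }
  return "CPU fallback";
}

// Assumed registration before this change: [7,13] and open-ended 14.
std::vector<KernelEntry> Before() {
  return {{7, 13, "RNN[7,13]"}, {14, INT_MAX, "RNN14+"}};
}

// Registration after this change: [7,13], [14,21], and open-ended 22.
std::vector<KernelEntry> After() {
  return {{7, 13, "RNN[7,13]"},
          {14, 21, "RNN[14,21]"},
          {22, INT_MAX, "RNN22+"}};
}

// Checks the behavior described in the PR under the assumptions above.
void CheckTiers() {
  assert(Resolve(Before(), 14) == "RNN14+");
  assert(Resolve(Before(), 22) == "CPU fallback");
  assert(Resolve(After(), 14) == "RNN[14,21]");
  assert(Resolve(After(), 22) == "RNN22+");
}
```

Under these assumptions, a node whose RNN schema since-version is 22 finds
no match in the old layout and resolves to the new non-versioned entry in
the updated one, while opset 14 nodes keep resolving to a CUDA kernel in
both.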
### Motivation and Context
The RNN CUDA operator was registered only at opset 14, while ONNX defines
it through opset 22, so models exported at newer opsets fell back to CPU
for this operator. Part of the broader CUDA EP opset gap effort tracked
in #27729.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>