Fill CUDA EP opset gaps for Round and Equal operators (#27754)
### Description
Caps the existing non-versioned CUDA kernel registrations and adds new
registrations at each operator's latest ONNX opset version:
- **Round**: opset 11 (non-versioned) → versioned 11–21 + new opset 22
- **Equal**: opset 13 (non-versioned) → versioned 13–18 + new opset 19
Changes across three files:
- `unary_elementwise_ops.cc` — `UNARY_OP_HFD(Round, 11)` →
`UNARY_OP_VERSIONED_HFD` + `UNARY_OP_HFD`
- `binary_elementwise_ops.cc` —
`BINARY_LOGICALOP_REGISTER_UZILHFD(Equal, 13)` → versioned 13–18 + new
19 (same for `bool` typed registration)
- `cuda_execution_provider.cc` — corresponding forward declarations and
`BuildKernelCreateInfo` entries
No type changes; both operators retain their existing CUDA type support
at the new opsets.
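
The sketch below models why the non-versioned registrations stopped matching. It is a simplified illustration, not ONNX Runtime's actual lookup code: `KernelReg`, `Matches`, `HasKernel`, `Before`, and `After` are hypothetical names. The matching rule is an assumption based on my understanding that a kernel with an open-ended range only matches a node whose operator schema was introduced at exactly the kernel's start version.

```cpp
#include <cassert>
#include <climits>
#include <string>
#include <vector>

// Hypothetical, simplified model of a kernel registration: an operator name
// plus the inclusive opset range [start, end] it was registered for.
// A non-versioned registration is modeled with end == INT_MAX.
struct KernelReg {
  std::string op;
  int start;
  int end;
};

// Assumed matching rule: a kernel serves a node when the kernel's start
// version equals the version that introduced the node's operator schema
// (since_version), or when since_version falls inside an explicitly capped
// range. An open-ended range does not cover later schema versions.
bool Matches(const KernelReg& k, const std::string& op, int since_version) {
  if (k.op != op) return false;
  if (k.start == since_version) return true;
  return k.start < since_version && k.end != INT_MAX && k.end >= since_version;
}

// Returns true if any registration in the table can serve the node.
bool HasKernel(const std::vector<KernelReg>& table, const std::string& op,
               int since_version) {
  for (const KernelReg& k : table) {
    assert(k.start <= k.end);  // sanity check on the registered range
    if (Matches(k, op, since_version)) return true;
  }
  return false;
}

// Registration tables before and after this change, using the ranges
// listed in the description above.
std::vector<KernelReg> Before() {
  return {{"Round", 11, INT_MAX}, {"Equal", 13, INT_MAX}};
}

std::vector<KernelReg> After() {
  return {{"Round", 11, 21}, {"Round", 22, INT_MAX},
          {"Equal", 13, 18}, {"Equal", 19, INT_MAX}};
}
```

Under this model, `HasKernel(Before(), "Round", 22)` is false while `HasKernel(After(), "Round", 22)` is true, and the capped 11–21 registration still serves nodes that resolve to the opset 11 schema.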
### Motivation and Context
Part of the ongoing effort to close ONNX opset coverage gaps in the
CUDA execution provider
(https://github.com/microsoft/onnxruntime/issues/27729). Without these
registrations, Equal nodes in models targeting opset 19+ and Round
nodes in models targeting opset 22+ fall back from CUDA to CPU.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>