onnxruntime
830a29ee - Fill CUDA EP opset gap for GRU operator (14 → 22) (#27738)

Commit

76 days ago

Fill CUDA EP opset gap for GRU operator (14 → 22) (#27738)

### Description

Extends GRU CUDA kernel registration from opset 14 to opset 22, following the same pattern as other recent opset gap fills (e.g., ConvTranspose in #27710).

- **`gru.cc`**: Cap existing opset-14 non-versioned kernel to versioned 14–21; add new non-versioned kernel at opset 22+
- **`cuda_execution_provider.cc`**: Update forward declarations and `BuildKernelCreateInfo` entries for versioned 14–21 and non-versioned 22+
- **`deep_cpu_gru_op_test.cc`**: Add CUDA-specific test for GRU at opset 22 with `linear_before_reset=1` (cuDNN requirement)
- **`docs/OperatorKernels.md`**: Update CUDA provider GRU entry to reflect `22+`, `[14, 21]`, and `[7, 13]` version ranges

No functional changes to the kernel implementation: the GRU spec is unchanged between opsets 14 and 22.

### Motivation and Context

CUDA EP registered GRU only up to opset 14, while ONNX defines GRU through opset 22. Models exported at opset ≥15 would fail to find a matching CUDA kernel and fall back to CPU. This is one of the P1 gaps tracked in #27729.

### Limitation

BF16 version is not added for GRU-22. It can be added later if needed.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
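
The effect of capping the old registration at 14–21 and adding a 22+ entry can be illustrated with a toy lookup. This is a minimal sketch, not ONNX Runtime code: `KernelRange`, `Matches`, `FindKernel`, `RegistryBefore`, and `RegistryAfter` are hypothetical names, and the matching rule (an open-ended kernel matches only the operator version it starts at, while a capped kernel matches any version inside its range) is a simplifying assumption about how the lookup behaves.

```cpp
#include <climits>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one kernel registration; not an ONNX Runtime type.
struct KernelRange {
  std::string name;
  int start;  // first operator version covered by the registration
  int end;    // last version covered, or INT_MAX for an open-ended ("N+") entry
};

// Assumed, simplified matching rule:
//  - any entry matches when its start equals the operator's since-version;
//  - a capped entry also matches a since-version inside (start, end].
inline bool Matches(const KernelRange& k, int since_version) {
  if (k.start == since_version) return true;
  return k.end != INT_MAX && k.start < since_version && since_version <= k.end;
}

// Returns the name of the first matching entry, or nullopt when no kernel
// matches (the case in which the node would not be placed on this provider).
inline std::optional<std::string> FindKernel(const std::vector<KernelRange>& registry,
                                             int since_version) {
  for (const auto& k : registry) {
    if (Matches(k, since_version)) return k.name;
  }
  return std::nullopt;
}

// GRU entries before the change: [7, 13] and an open-ended entry at 14.
inline std::vector<KernelRange> RegistryBefore() {
  return std::vector<KernelRange>{
      KernelRange{"GRU [7, 13]", 7, 13},
      KernelRange{"GRU 14+", 14, INT_MAX},
  };
}

// GRU entries after the change: [7, 13], [14, 21], and an open-ended entry at 22.
inline std::vector<KernelRange> RegistryAfter() {
  return std::vector<KernelRange>{
      KernelRange{"GRU [7, 13]", 7, 13},
      KernelRange{"GRU [14, 21]", 14, 21},
      KernelRange{"GRU 22+", 22, INT_MAX},
  };
}
```

Under these assumptions, `FindKernel(RegistryBefore(), 22)` finds nothing, while `FindKernel(RegistryAfter(), 22)` resolves to the new open-ended entry and version 14 still resolves to the capped `[14, 21]` entry.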

References

#27738 - Fill CUDA EP opset gap for GRU operator (14 → 22)

Author

Copilot

Parents

b10fa545
