Fill GlobalAveragePool and GlobalMaxPool opset gap in CUDA provider (1→22) (#27733)
### Description
Extends CUDA kernel registrations for `GlobalAveragePool` and
`GlobalMaxPool` from opset 1 only to the full opset 1–22 range. Follows
the same pattern used for `MaxPool` in #27715.
- **`core/providers/cuda/nn/pool.cc`** — Split single opset-1
registrations into versioned 1–21 + opset 22 for both NCHW and NHWC
variants
- **`core/providers/cuda/cuda_execution_provider.cc`** — Updated class
declarations and `BuildKernelCreateInfo` entries (versioned 1–21, added
opset 22)
- **`core/providers/cuda/cuda_nhwc_kernels.cc`** — Same for NHWC kernel
registrations
- **`test/providers/cpu/nn/pool_op_test.cc`** — Added
`GlobalAveragePool_22_CUDA` test
- **`docs/OperatorKernels.md`** — Updated GlobalAveragePool and
GlobalMaxPool entries from `1+` to `22+` / `[1, 21]` in both the ai.onnx
and com.microsoft.internal.nhwc domains under CUDAExecutionProvider
No functional changes to the kernel implementations: opsets 1 through 22
are spec-compatible for these ops.
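Since the kernels are untouched, the spec-compatibility claim reduces to the op's semantics being identical across opsets: GlobalAveragePool averages over all spatial dimensions, producing an N x C x 1 x ... x 1 output. A minimal reference sketch of that behavior (illustrative only, not the CUDA kernel; the function name and flattened-spatial layout are choices made here):

```cpp
#include <cstddef>
#include <vector>

// Reference semantics of GlobalAveragePool, unchanged from opset 1 to 22:
// for each (n, c) plane, average over all spatial elements. Input is NCHW
// (or NCDHW, etc.) with the spatial dims flattened into `spatial`.
std::vector<float> GlobalAveragePoolRef(const std::vector<float>& x,
                                        size_t n, size_t c, size_t spatial) {
  std::vector<float> y(n * c, 0.0f);
  for (size_t i = 0; i < n * c; ++i) {
    float sum = 0.0f;
    for (size_t s = 0; s < spatial; ++s) sum += x[i * spatial + s];
    y[i] = sum / static_cast<float>(spatial);  // one value per (n, c) plane
  }
  return y;
}
```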
### Motivation and Context
`GlobalAveragePool` and `GlobalMaxPool` were registered at opset 1 only
in the CUDA provider, creating a 21-version gap to the latest ONNX opset
22. Models exported at higher opsets would fail to find a matching CUDA
kernel. Identified as P1 gaps in #27729.
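The failure mode can be pictured with a toy versioned-registry sketch (the types and function below are hypothetical illustrations, not onnxruntime's actual registry API): a kernel is only dispatchable when the model's opset falls inside some registered `[start, end]` range, so splitting the registration into 1–21 plus 22+ closes the gap.

```cpp
#include <climits>
#include <map>
#include <string>

// Toy model of versioned kernel registration: each entry covers an
// inclusive opset range [start, end]; an open-ended "22+" registration
// uses INT_MAX as its end.
struct OpsetRange {
  int start;
  int end;
};

using Registry = std::multimap<std::string, OpsetRange>;

// True if some registered range for `op` covers `model_opset`.
bool HasKernel(const Registry& reg, const std::string& op, int model_opset) {
  auto range = reg.equal_range(op);
  for (auto it = range.first; it != range.second; ++it) {
    if (model_opset >= it->second.start && model_opset <= it->second.end)
      return true;
  }
  return false;
}
```

With only a `[1, 21]` entry, a model exported at opset 22 finds no match; adding the `[22, INT_MAX]` entry makes the lookup succeed without changing the kernel body.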
### Limitations
BF16 support for GlobalAveragePool-22 and GlobalMaxPool-22 is not added
in this PR.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>