onnxruntime
470977a4 - [CoreML EP] Support pre-opset-13 Split via 'split' attribute (#28270)

Commit
7 days ago
### Description

The CoreML `SplitOpBuilder` previously gated `GetMinSupportedOpSet` at 13 because pre-13 `Split` carries split sizes via an INTS attribute rather than a second input. This PR lowers the gate to 1 and reads the attribute in both the MLProgram and NeuralNetwork emitters, so `Split` from any opset is supported on the CoreML EP.

The validation in `IsOpSupportedImpl` mirrors the existing input-form rules: ≥2 outputs, sum of sizes equals the axis dim, all sizes positive, axis dim not dynamic. For the no-attribute / no-input case (legacy even-split) we also explicitly require the axis dim to be evenly divisible by `num_outputs`, since CoreML's `num_splits` requires that.

This is a behavior change only for opset 2–12 graphs that were 100% rejected before, so no path that used to work regresses.

### Motivation

DWPose `dw-ll_ucoco_384.onnx` (opset 11), a common pose-estimation model, has two `Split` nodes: one uneven (`split=[512, 512, 128]`) and one even (`split=[1, 1]`). Both fall back to CPU today, fragmenting the CoreML partition.

| | Without this PR | With this PR |
|---|---|---|
| CoreML partitions | 3 | **1** |
| Nodes on CoreML EP | 301 / 303 | **303 / 303** |

### Benchmark: M3 Max, MLProgram, batch 1, 1299-iter steady state

| Metric | Without PR | With PR | Δ |
|---|---|---|---|
| Mean | 6.838 ms | 6.565 ms | −4.0% |
| **StdDev** | **0.239 ms** | **0.170 ms** | **−29%** |
| P50 | 6.810 ms | 6.545 ms | −3.9% |
| P95 | 7.070 ms | 6.775 ms | −4.2% |
| P99 | 7.330 ms | 6.928 ms | −5.5% |
| P99.9 | 8.917 ms | 8.164 ms | −8.4% |
| **Max** | **12.616 ms** | **10.360 ms** | **−17.9%** |

Removing the two CPU↔CoreML round trips improves the tail far more than the median, which is useful for real-time perception pipelines where worst-case latency determines the frame budget.
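
For readers who want the validation rules from the Description in one place, here is a minimal standalone sketch. It is not the actual `IsOpSupportedImpl` code: the function name `IsSplitShapeSupported` and its parameters are hypothetical, and the check that the number of split sizes matches the number of outputs is an added assumption not stated in the commit message.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper illustrating the rules described above; it does not
// reproduce the real SplitOpBuilder implementation.
//
// axis_dim:    static size of the split axis, or std::nullopt if dynamic.
// num_outputs: number of outputs of the Split node.
// split_sizes: explicit sizes (pre-13 attribute or 13+ input), or
//              std::nullopt for the legacy even-split case.
bool IsSplitShapeSupported(std::optional<int64_t> axis_dim,
                           int64_t num_outputs,
                           const std::optional<std::vector<int64_t>>& split_sizes) {
  if (!axis_dim.has_value()) return false;  // axis dim must not be dynamic
  if (num_outputs < 2) return false;        // at least two outputs

  if (!split_sizes.has_value()) {
    // Legacy even split: the axis dim must divide evenly by num_outputs.
    return *axis_dim % num_outputs == 0;
  }

  // Assumption (not in the commit message): one size per output.
  if (static_cast<int64_t>(split_sizes->size()) != num_outputs) return false;

  int64_t sum = 0;
  for (int64_t size : *split_sizes) {
    if (size <= 0) return false;  // all sizes must be positive
    sum += size;
  }
  return sum == *axis_dim;  // sizes must cover the axis exactly
}
```

Under these assumptions, DWPose's uneven `split=[512, 512, 128]` on an axis of size 1152 is accepted, while a split containing a zero entry, or an even split of an axis of size 9 into 2 outputs, is rejected.
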
### Tests

Eight new tests in `onnxruntime/test/providers/coreml/coreml_basic_test.cc`, each exercising both the NeuralNetwork and MLProgram emitters and asserting full CoreML EP node assignment (no CPU fallback).

**Pre-opset-13 attribute form (the new code path):**
- `Split7UnevenAttribute`: opset 7 uneven `split=[4, 3, 2]`, covering the opset 7–10 range.
- `Split11UnevenAttribute`: DWPose's pattern, `split=[4, 3, 2]`.
- `Split11EvenAttribute`: uniform sizes via attribute.
- `Split11NoAttributeEven`: falls through to the `num_splits = num_outputs` branch.

**Post-opset-13 input form (parity with the existing, untouched path):**
- `Split13UnevenInput`: `split` input `[4, 3, 2]`.
- `Split13EvenInput`: uniform sizes via input.
- `Split13NoSplitInputEven`: no `split` input, even-split fallback.

**Negative coverage:**
- `Split11ZeroSplitValueNotSupported`: verifies the attribute-form rejection of a non-positive entry; expects no CoreML assignment.

All eight pass locally on macOS 26.3 / M3 Max.

### Motivation for upstreaming

Most pre-2023 vision exports (DWPose, MMPose models, original YOLOv5/v7/v8, etc.) target ONNX opset 11/12 and use the `Split` attribute form. They currently lose any `Split` to CPU on the CoreML EP. This is a self-contained gap with a clean fix.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>