onnxruntime
2da1a300 - Fix RoiAlign heap out-of-bounds read via unchecked batch_indices (#27543)

# Fix RoiAlign heap out-of-bounds read via unchecked batch_indices (#27543)

## Description

Add value-range validation for `batch_indices` in the RoiAlign operator to prevent out-of-bounds heap reads from maliciously crafted ONNX models.

`CheckROIAlignValidInput()` previously validated tensor shapes but never checked that the **values** in `batch_indices` fall within `[0, batch_size)`. An attacker could supply `batch_indices` containing values exceeding the batch dimension of the input tensor `X`, causing the kernel to read arbitrary heap memory at:

- **CPU:** `roialign.cc:212` — `roi_batch_ind` used as an unchecked index into `bottom_data`
- **CUDA:** `roialign_impl.cu:109` — `batch_indices_ptr[n]` used as an unchecked index into `bottom_data` on the GPU

## Impact

- **Vulnerability type:** Heap out-of-bounds read
- **Impact:** Arbitrary heap memory read, potential information disclosure, program crash
- **Trigger:** Construct `batch_indices` with values ≥ `batch_size` or < 0
- **Affected providers:** CPU and CUDA (both call `CheckROIAlignValidInput()`)

## Changes

### `onnxruntime/core/providers/cpu/object_detection/roialign.cc`

- Added a per-element bounds check in `CheckROIAlignValidInput()`: each `batch_indices[i]` must satisfy `0 <= value < X.shape[0]`
- Returns `INVALID_ARGUMENT` with a descriptive error message on violation
- Guarded by `batch_indices_ptr->Location().device.Type() == OrtDevice::CPU` so the check only runs when the tensor data is host-accessible (CPU EP and CropAndResize). For the CUDA EP, `batch_indices` lives in GPU memory and cannot be safely dereferenced on the host.
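The added validation can be sketched as follows. This is a minimal standalone version, not the committed code: the real check lives inside `CheckROIAlignValidInput()` and returns an `INVALID_ARGUMENT` `Status`, while this sketch uses plain types and returns an error string.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical standalone version of the per-element bounds check.
// Returns an empty string on success, or a descriptive error message
// (the real code returns an INVALID_ARGUMENT Status instead).
std::string ValidateBatchIndices(const std::vector<int64_t>& batch_indices,
                                 int64_t batch_size) {
  for (size_t i = 0; i < batch_indices.size(); ++i) {
    const int64_t v = batch_indices[i];
    // Each value must satisfy 0 <= v < batch_size, i.e. it must name an
    // existing batch element of the input tensor X.
    if (v < 0 || v >= batch_size) {
      std::ostringstream oss;
      oss << "batch_indices[" << i << "] = " << v
          << " is out of range [0, " << batch_size << ")";
      return oss.str();
    }
  }
  return "";  // all indices in range
}
```

Rejecting the model at input-validation time means the kernel bodies at `roialign.cc:212` and `roialign_impl.cu:109` never see an out-of-range index on the CPU path.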
### `onnxruntime/test/providers/cpu/object_detection/roialign_test.cc`

- Added `BatchIndicesOutOfRange` test: `batch_indices={1}` with `batch_size=1` (exercises the `>= batch_size` path)
- Added `BatchIndicesNegative` test: `batch_indices={-1}` (exercises the `< 0` path)

## Known Limitation

The CUDA execution path is **not** protected by this bounds check because `batch_indices` is a GPU tensor and cannot be read on the host. Adding a device-side bounds check would require passing `batch_size` into the CUDA kernel — this is tracked as a follow-up.

Note: Using `.InputMemoryType(OrtMemTypeCPUInput, 2)` was considered but rejected because it would force a GPU→CPU transfer of `batch_indices`, breaking CUDA graph capture for models like Mask R-CNN where `batch_indices` is produced by upstream GPU ops.

## Validation

- Full `RoiAlignTest.*` suite passes (12/12 tests) on the CPU build
- Full `RoiAlignTest.*` suite passes (12/12 tests) on the CUDA build
- No regressions in existing positive or negative tests
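One plausible shape for the device-side follow-up check is a per-ROI guard inside the kernel that skips out-of-range ROIs and records a violation in a flag inspected after launch. The sketch below is plain C++ for illustration and is an assumption, not the committed fix; the names `batch_indices_ptr` and `batch_size` mirror the commit text, but `error_flag` and the skip-and-flag pattern are hypothetical.

```cpp
#include <cstdint>

// Hypothetical per-ROI guard for a device-side bounds check. In a real
// fix this would run inside the RoiAlign CUDA kernel (roialign_impl.cu),
// with `error_flag` living in device memory and checked after the launch.
inline bool RoiBatchIndexInRange(const int64_t* batch_indices_ptr, int n,
                                 int64_t batch_size, int* error_flag) {
  const int64_t roi_batch_ind = batch_indices_ptr[n];
  if (roi_batch_ind < 0 || roi_batch_ind >= batch_size) {
    *error_flag = 1;  // record the violation; the thread skips this ROI
    return false;
  }
  return true;  // safe to index bottom_data with roi_batch_ind
}
```

This keeps `batch_indices` on the GPU, so it avoids the GPU→CPU transfer that made `.InputMemoryType(OrtMemTypeCPUInput, 2)` unacceptable for CUDA graph capture.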