Do not create numpy arrays on top of non-owning Tensor buffers (#28088)
This pull request fixes a dangling-pointer hazard in tensor-to-numpy
conversions in the ONNX Runtime Python bindings, which arises when model
outputs alias input buffers. It introduces logic to ensure that numpy
arrays returned from session outputs do not share memory with input
arrays unless it is safe to do so, and adds targeted tests to prevent
regressions.
**Tensor-to-Numpy Conversion Safety Improvements:**
* Updated the `GetPyObjFromTensor` function signature in both the header
(`onnxruntime_pybind_mlvalue.h`) and the implementation
(`onnxruntime_pybind_state.cc`) to accept a new `zero_copy_non_owning`
boolean parameter, giving callers explicit control over zero-copy behavior.
* Changed the logic in `GetPyObjFromTensor` so that a zero-copy numpy
array is created only if the tensor owns its buffer or if
`zero_copy_non_owning` is explicitly set. Otherwise, the data is copied,
which prevents use-after-free errors if the original input memory is
released while the numpy array is still in use.
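
The zero-copy rule described in this section can be modeled as a small truth table. This is only an illustrative Python sketch, not the C++ implementation: `should_zero_copy` and its `tensor_owns_buffer` argument are hypothetical names, and only `zero_copy_non_owning` is taken from this PR.

```python
def should_zero_copy(tensor_owns_buffer: bool, zero_copy_non_owning: bool) -> bool:
    """Hypothetical model of the decision described above.

    Returns True when a numpy array may wrap the tensor's memory in place,
    and False when the data should be copied into a fresh array instead.
    """
    return tensor_owns_buffer or zero_copy_non_owning


# Enumerate all four combinations of the two inputs.
for owns in (True, False):
    for flag in (True, False):
        action = "wrap in place" if should_zero_copy(owns, flag) else "copy"
        print(f"owns_buffer={owns!s:5} zero_copy_non_owning={flag!s:5} -> {action}")
```

Under this model, the only combination that forces a copy is a tensor that does not own its buffer with `zero_copy_non_owning` left unset, which corresponds to an output that aliases caller-provided input memory.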
**Device Handling Updates:**
* Modified device-specific code paths in
`onnxruntime_pybind_ortvalue.cc` to always request zero-copy for outputs
from non-CPU devices, ensuring consistent and safe behavior across all
supported hardware backends.
**Testing and Regression Coverage:**
* Added tests in `onnxruntime_test_python.py` to verify that outputs
which alias inputs are returned as independent numpy arrays, so that a
dangling pointer cannot corrupt the returned data. Also added a test to
confirm that session-allocated outputs still use zero-copy numpy arrays.
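
The sketch below shows the kind of independence check such a test could make, using only numpy. It is an assumption about the approach, not the test code from this PR: `x.view()` stands in for a zero-copy array over a non-owning buffer, and `x.copy()` stands in for the independent array the fix is meant to return.

```python
import numpy as np

# Caller-owned input buffer.
x = np.arange(6, dtype=np.float32).reshape(2, 3)

# Stand-in for a zero-copy output that aliases the input buffer.
aliased = x.view()
# Stand-in for an output that was copied into its own memory.
independent = x.copy()

# np.shares_memory reports whether two arrays overlap in memory.
assert np.shares_memory(x, aliased)
assert not np.shares_memory(x, independent)

# Mutating the input is visible through the alias but not through the copy.
x[0, 0] = 100.0
print(aliased[0, 0], independent[0, 0])
```

In the real failure mode the input memory may be freed rather than merely modified, so the aliased array would read invalid memory. The mutation here is only a safe way to make the sharing observable.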
Addresses issue: https://github.com/microsoft/onnxruntime/issues/21922