Fix NuGet DLL Loading on Linux and macOS (#27266)
## Summary
This PR addresses persistent native library loading issues in the ONNX
Runtime NuGet package, specifically on macOS and Linux, by implementing
a robust DllImportResolver. It also includes necessary pipeline and
packaging adjustments to ensure required macOS artifacts are correctly
located and validated during CI.
## Problem
https://github.com/microsoft/onnxruntime/issues/27263 reports the error
`Unable to load shared library 'onnxruntime.dll' or one of its
dependencies`. It was caused by
https://github.com/microsoft/onnxruntime/pull/26415, which hard-coded
`onnxruntime.dll` even on Linux and macOS. The correct filenames are
`libonnxruntime.so` on Linux and `libonnxruntime.dylib` on macOS.
The NuGet test pipeline has been broken for a while, so we also need to
fix the pipeline in order to test this change. It has the following issues:
* The macOS NuGet package targets arm64, but the vmImage `macOS-15` is x64.
* The macOS NuGet test needs `libcustom_op_library.dylib`, but it is not
copied from the artifacts to the test environment.
* The macOS artifact contains `libonnxruntime.dylib` and
`libonnxruntime.1.24.1.dylib`, where `libonnxruntime.dylib` is a symlink.
This causes problems because the latter is excluded by the nuspec.
* The macOS NuGet test uses models from the onnx repo. However, the latest
onnx includes some models with data types such as float8 that are not
supported by C#, so those model tests fail.
* The Linux NuGet test uses the Dockerfile
`Dockerfile.package_ubuntu_2404_gpu`, but the docker build fails because of
the versions of `libnvinfer-headers-python-plugin-dev` and
`libnvinfer-win-builder-resource10`.
## Changes
### 1. Robust C# DLL Resolution
The DllImportResolver has been enhanced to handle various deployment
scenarios where standard .NET resolution might fail:
- **Platform-Specific Naming**: Maps extension-less library names
(`onnxruntime`, `ortextensions`) to appropriate filenames
(`onnxruntime.dll`, `libonnxruntime.so`, `libonnxruntime.dylib`) based
on the OS.
- **Multi-Stage Probing**:
  1. **Default Loading**: Attempts `NativeLibrary.TryLoad` with the mapped
     name.
  2. **NuGet `runtimes` Probing**: If the above fails, it probes the
     `runtimes/{rid}/native/` subdirectories relative to the assembly
     location, covering common RIDs (`win-x64`, `linux-arm64`, `osx-arm64`,
     etc.).
  3. **Base Directory Fallback**: As a final attempt, it looks in
     `AppContext.BaseDirectory`.
- **Case-Sensitivity Handling**: Ensures lowercase extensions are used
on Windows to prevent lookup failures on case-sensitive filesystems.
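
The naming and probing order described above can be sketched in a language-neutral way. The Python below is only an illustration of the logic, not the C# resolver from this PR: the function names (`platform_key`, `map_library_name`, `candidate_paths`) and the `_NAMING` table are hypothetical, and the real resolver loads each candidate with `NativeLibrary.TryLoad` rather than returning a list.

```python
import posixpath
import sys

# Assumed (prefix, extension) per OS, matching the filenames named in this
# description: onnxruntime.dll, libonnxruntime.so, libonnxruntime.dylib.
_NAMING = {
    "windows": ("", ".dll"),
    "linux": ("lib", ".so"),
    "macos": ("lib", ".dylib"),
}


def platform_key(sys_platform=None):
    """Collapse a sys.platform value into one of the three keys above."""
    p = sys.platform if sys_platform is None else sys_platform
    if p.startswith("win"):
        return "windows"
    if p == "darwin":
        return "macos"
    return "linux"  # assumption: everything else is treated as Linux


def map_library_name(name, os_key):
    """Map an extension-less name (e.g. 'onnxruntime') to an OS filename."""
    prefix, ext = _NAMING[os_key]
    return f"{prefix}{name}{ext}"


def candidate_paths(name, os_key, rid, assembly_dir, base_dir):
    """Return the locations to try, in the order of the three stages."""
    file_name = map_library_name(name, os_key)
    return [
        file_name,  # stage 1: let the default loader search by name
        posixpath.join(assembly_dir, "runtimes", rid, "native", file_name),  # stage 2
        posixpath.join(base_dir, file_name),  # stage 3: base directory fallback
    ]
```

For example, `candidate_paths("onnxruntime", "macos", "osx-arm64", "/app", "/base")` lists the bare name first, then the `runtimes/osx-arm64/native/` path under the assembly directory, then the base-directory path. A resolver would stop at the first candidate that loads.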
### 2. macOS CI/Packaging Improvements
- **Templates (`test_macos.yml`)**:
  - Updated to extract artifacts from TGZ files.
  - Ensures `libcustom_op_library.dylib` is placed in the expected
    location (`testdata/testdata`) for end-to-end tests.
  - Initializes the ONNX submodule to provide required test data.
- **Node.js**:
  - Restored the Node.js macOS test stage in
    `c-api-noopenmp-test-pipelines.yml`, configured to run on the ARM64 pool
    (`AcesShared`).
  - Updated the `test_macos.yml` template to support custom agent pools
    (similar to the NuGet template).
- **Pipeline Config**: Adjusted agent pool selection and demands for
macOS jobs to ensure stable execution.
- **Binary Robustness**: The `copy_strip_binary.sh` script now ensures
`libonnxruntime.dylib` is a real file rather than a symlink, improving
NuGet packaging reliability.
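
To illustrate the symlink fix, here is a minimal Python sketch of replacing a symlink with a regular-file copy of its target. It is not the content of `copy_strip_binary.sh` (a shell script whose exact commands are not shown here), and the helper name `materialize_symlink` is hypothetical.

```python
import os
import shutil


def materialize_symlink(path):
    """Replace a symlink at `path` with a regular-file copy of its target.

    Returns True if a symlink was replaced, False if `path` was not a symlink.
    """
    if not os.path.islink(path):
        return False  # already a real file (or missing); nothing to do
    target = os.path.realpath(path)  # resolve the link to the actual file
    os.unlink(path)  # removes the link itself, not the target
    shutil.copy2(target, path)  # copy contents and metadata to the old link path
    return True
```

After this step, a package that includes only `libonnxruntime.dylib` still ships a complete binary, even if the versioned file is excluded.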
### 3. Test Refinements
- **Inference Tests**: Skips a specific set of pretrained-model test
cases on macOS that are currently known to be flaky or unsupported in
that environment, preventing noise in the CI results.
## Verification
### Pipelines
- [x] Verified in `NuGet_Test_MacOS`.
- [x] Verified in `NuGet_Test_Linux`.
- [x] Verified in Windows test pipelines.
### Net Effect
The C# bindings are now significantly more resilient to different
deployment environments. The CI process for macOS is also more robust,
correctly handling the artifacts required for comprehensive NuGet
validation.