fix: make sympy an optional runtime dependency (#28141)
## Summary
- Defer `sympy` import so `import onnxruntime.quantization` succeeds
without sympy installed
- Move the `SymbolicShapeInference` import into `quant_pre_process` so
it runs only when `skip_symbolic_shape` is false
- Defer sympy-dependent imports in `transformers.onnx_model` and
`transformers.shape_infer_helper`
- Raise a clear, actionable `ImportError` instructing users to install
sympy when needed
## Motivation
Fixes #24872. `sympy` (~29 MB plus `mpmath` ~2 MB) was a hard runtime
dependency even though it is only needed for symbolic shape inference.
Pure-inference users — the common case — pay the install/import cost for
functionality they do not use. `setup.py` already declares sympy as an
optional extra (`"symbolic": ["sympy"]`), but top-level imports forced
it to load unconditionally.
## Changes
- `onnxruntime/python/tools/quantization/shape_inference.py`: move `from
onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference`
from module top level into `quant_pre_process`, guarded by `if not
skip_symbolic_shape`. Wrap the import in a `try`/`except ImportError`
block that re-raises with install instructions.
- `onnxruntime/python/tools/transformers/onnx_model.py`: move the `from
shape_infer_helper import SymbolicShapeInferenceHelper` import from
module top level into the two methods that instantiate it. Add a
`TYPE_CHECKING`-guarded import for type annotations.
- `onnxruntime/python/tools/transformers/shape_infer_helper.py`: wrap
the import of `symbolic_shape_infer` in `try/except ImportError`. The
`SymbolicShapeInferenceHelper.__init__` now raises a clear `ImportError`
when sympy is unavailable, instead of failing at module load time.
- `onnxruntime/test/python/quantization/test_quant_preprocess.py`: add
`test_skip_symbolic_shape_does_not_require_sympy`, which removes sympy
from `sys.modules` and verifies that `quant_pre_process(...,
skip_symbolic_shape=True)` completes successfully.
No public API signatures change. Users who want symbolic shape inference
install sympy as before (`pip install sympy` or `pip install
onnxruntime[symbolic]`).
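The deferred-import pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `require_optional`, `simplify_dim`, and `pre_process` are hypothetical names, and `pre_process` is only a stand-in for an entry point with a `skip_symbolic_shape` flag.

```python
from __future__ import annotations

import importlib
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Evaluated by type checkers only; sympy is not imported at runtime here.
    import sympy


def require_optional(module_name: str, feature: str, install_hint: str) -> Any:
    """Import an optional dependency on first use, with an actionable error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires '{module_name}', which is not installed. "
            f"Install it with `{install_hint}`."
        ) from exc


def simplify_dim(expr: str) -> sympy.Expr:
    """Hypothetical helper: the annotation needs sympy only for type checking."""
    sp = require_optional("sympy", "Symbolic shape inference", "pip install sympy")
    return sp.sympify(expr)


def pre_process(model_path: str, skip_symbolic_shape: bool = False) -> str:
    """Hypothetical stand-in for a pre-processing entry point."""
    if not skip_symbolic_shape:
        # The optional dependency is touched only on this branch.
        require_optional("sympy", "Symbolic shape inference", "pip install sympy")
    return model_path
```

With this shape, importing the module never loads sympy; only the code paths that need it pay the import cost, and a missing install surfaces as an `ImportError` that names the fix.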
## Test Plan
- `python -m pytest
onnxruntime/test/python/quantization/test_quant_preprocess.py -v`: all
tests pass, including the new test.
- Smoke-tested locally: `import onnxruntime.quantization` no longer
pulls `sympy` into `sys.modules`.
- `lintrunner -a` clean on all changed files.
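A smoke check like the one above could be scripted along these lines. This is a minimal sketch, not the check actually run for this PR; `import_pulls_in` is a hypothetical helper, and it uses a fresh interpreter so modules already loaded in the current process do not skew the result.

```python
import subprocess
import sys


def import_pulls_in(module: str, unwanted: str) -> bool:
    """Return True if importing `module` in a fresh interpreter loads `unwanted`."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print({unwanted!r} in sys.modules)"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case the imported module prints on import.
    return result.stdout.strip().splitlines()[-1] == "True"
```

Assuming onnxruntime is installed, `import_pulls_in("onnxruntime.quantization", "sympy")` would be expected to return `False` with this change applied.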
Fixes #24872