ruff
a33be006 - [ty] Make inferred specializations line up with source types more better (#23715)

Commit
30 days ago
[ty] Make inferred specializations line up with source types more better (#23715) I discovered this while looking into some unexpected ecosystem results on https://github.com/astral-sh/ruff/pull/22578. It took a couple of days of wrestling with pi to discover the issue and reduce it to the following example: ```py def f(l: list[tuple[Any | str, Any | str]]) -> None: # was: dict[str | Any, str | Any] # should be: dict[Any | str, Any | str] reveal_type(dict(l)) ``` We are trying to infer a specialization of `dict` when we pass in the `list[tuple[...]]` type an argument. This matches against the constructor overload that takes in an `Iterator[tuple[_KT, _VT]]`. All good so far. As part of doing that, we create the constraint set `(Any | str ≤ _KT) ∧ (Any | str ≤ _VT)`, which has the following structure: ``` <0> (str ≤ _VT@dict) 4/4 ┡━₁ <1> (Any ≤ _VT@dict) 3/3 │ ┡━₁ <2> (str ≤ _KT@dict) 2/2 │ │ ┡━₁ <3> (Any ≤ _KT@dict) 1/1 │ │ │ ┡━₁ always │ │ │ └─₀ never │ │ └─₀ never │ └─₀ never └─₀ never ``` A few interesting things have happened: - Both typevars correctly have lower bound (contravariant) constraints: function parameters are contravariant; `Iterator` is covariant; contra × co = contra - The constraint set lines up with "source order": the type explicitly mentioned in the code is `Any | str`, and `_KT` appears before `_VT` in the definition of `dict`. - We break apart lower bound unions: e.g. `Any | str ≤ _KT` becomes `(Any ≤ _KT) ∧ (str ≤ _KT)` - That gives us four "atomic" constraints, which we correctly create in source order: `Any ≤ _KT`, `str ≤ _KT`, `Any ≤ _VT`, `str ≤ _VT`. (The `1/1` etc in the graph rendering is the `source_order` that we have assigned to each constraint.) - Our BDDs use the _reverse_ of the constraint ID ordering, so the constraint with the smallest ID (`Any ≤ _KT`) appears closest to the leaf nodes. (This is on purpose for efficiency reasons) Everything described so far is correct and expected behavior. Next we have to find a solution for this constraint set. To do that, we find every path from the root to the `always` terminal, remembering all of the constraints that we encounter along that path. In this case, there is only one path: ``` str ≤ _VT (4) Any ≤ _VT (3) str ≤ _KT (2) Any ≤ _KT (1) ``` Those constraints are still in reverse order! But we're tracking a `source_order` for each constraint, and we [sort by `source_order`](https://github.com/astral-sh/ruff/blob/149c5786ca6ae135431b86d1cd8ab74763b9a95a/crates/ty_python_semantic/src/types/constraints.rs#L3042-L3051) before turning this list of constraints into a solution. That _should_ mean that we build up the unions as `Any | str`. So why were we seeing `str | Any` instead? If you print out the `source_order`s that we actually see when we get to the sort, we have: ``` str ≤ _VT (4) Any ≤ _VT (4) str ≤ _KT (2) Any ≤ _KT (2) ``` The `Any` constraints have copied their `source_order`s from the corresponding `str` constraints! What has happened is that some purposeful behavior has engaged too aggressively. When we look at a path that represents a solution, we want to know _all_ of the constraints that are true for that path — not just the ones that are explicitly mentioned in the BDD. We have a `SequentMap` type that records the relationships between constraints. Among other things, it records _implications_: "when X is true, Y is true as well". As we build up potential paths, each time we add one of the constraints from the BDD, we immediately check the sequent map to see if we now know any other _derived_ facts to be true. If so, we add them to the path. And importantly, we want those derived facts to appear "near" their "origin" constraint — so we give the derived constraint the same `source_order` as the BDD constraint we just added. In this case, the first constraint we encounter in the BDD is `str ≤ _VT (4)`. The sequent map tells us that `str ≤ _VT` implies `Any ≤ _VT`, so we add `Any ≤ _VT` as a _derived_ constraint, which inherits `source_order=4` from its origin constraint. Next we encounter `Any ≤ _VT` as an _origin_ constraint (since it appears in the BDD). But since it's already in the path, we don't add it as a duplicate or update its `source_order`. The fix is straightforward: origin `source_order`s should take precedence over inherited derived `source_order`s.
Author
Parents
Loading