[ty] Use distributed versions of AND and OR on constraint sets (#22614)
There are some pathological examples where we create a constraint set
that is the AND or OR of many smaller constraint sets. For example, when
calling a function with many overloads, where the argument is a typevar,
we create an OR with one operand per overload, each stating that the
typevar specializes to a type compatible with the corresponding
parameter of that overload.
Most functions have a small number of overloads. But there are some
examples of methods with 15-20 overloads (pydantic, numpy, our own
auto-generated `__getitem__` for large tuple literals). For those cases,
it is helpful to be more clever about how we construct the final result.
Before, we would just step through the `Iterator` of elements and
accumulate them into a result constraint set. That results in `O(n)`
calls to the underlying `and` or `or` operator. Because the accumulated
result keeps growing, each of those calls might have to construct a
large temporary BDD tree.
AND and OR are both associative, so we can do better! We now invoke the
operator in a "tree" shape (described in more detail in the doc
comment). We still have to perform the same number of calls, but more of
the calls operate on smaller BDDs, so the overall amount of work is much
smaller.
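To illustrate the difference in shape, here is a minimal standalone
sketch. It is not the code from this PR: `linear_reduce`, `tree_reduce`,
and `total_cost` are hypothetical names, and the cost model (each operand
has a size, one call costs the sum of its operand sizes, and the result
is as large as both operands combined) is an assumption standing in for
real BDD sizes.

```rust
/// Left-to-right accumulation: ((a op b) op c) op d ...
/// Hypothetical helper, shown only for comparison.
fn linear_reduce<T>(
    items: impl IntoIterator<Item = T>,
    op: impl FnMut(T, T) -> T,
) -> Option<T> {
    items.into_iter().reduce(op)
}

/// Pairwise ("tree"-shaped) reduction: combine neighbours, then combine
/// the results, and so on until one element is left. Operand order is
/// preserved, so this only relies on `op` being associative.
fn tree_reduce<T>(
    items: impl IntoIterator<Item = T>,
    mut op: impl FnMut(T, T) -> T,
) -> Option<T> {
    let mut level: Vec<T> = items.into_iter().collect();
    while level.len() > 1 {
        let mut next = Vec::with_capacity((level.len() + 1) / 2);
        let mut iter = level.into_iter();
        while let Some(left) = iter.next() {
            match iter.next() {
                Some(right) => next.push(op(left, right)),
                // Odd element out: carry it up to the next level as-is.
                None => next.push(left),
            }
        }
        level = next;
    }
    level.pop()
}

/// Returns (number of operator calls, total cost) for combining `n`
/// unit-sized operands under the assumed cost model described above.
fn total_cost(n: usize, tree: bool) -> (usize, usize) {
    let mut calls = 0;
    let mut cost = 0;
    let op = |a: usize, b: usize| {
        calls += 1;
        cost += a + b; // assumed: work is proportional to operand sizes
        a + b // assumed: the result is as large as both operands combined
    };
    let leaves = std::iter::repeat(1usize).take(n);
    if tree {
        let _ = tree_reduce(leaves, op);
    } else {
        let _ = linear_reduce(leaves, op);
    }
    (calls, cost)
}
```

Under this model, `total_cost(16, false)` gives `(15, 135)` and
`total_cost(16, true)` gives `(15, 64)`: the same 15 calls, but the
tree shape does less than half the work, and the gap widens as `n`
grows (roughly `n^2 / 2` versus `n log n`). Real constraint sets will
not follow this model exactly, so treat the numbers as indicative only.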