Subtype: some performance tuning. (#56007)
The main motivation of this PR is to fix #55807.
dc689fe8700f70f4a4e2dbaaf270f26b87e79e04 tries to remove the slow
`may_contain_union_decision` check by reorganizing the code path. The
fast path has been removed, and most of its optimizations have been
integrated into the preserved slow path.
Since the slow path stores all inner ∃ decisions on the outermost R
stack, there is a risk of overflow.
aee69a41441b4306ba3ee5e845bc96cb45d9b327 should address that concern.
With these changes, the reported MWE now gives:
```
0.000002 seconds
0.000040 seconds (105 allocations: 4.828 KiB, 52.00% compilation time)
0.000023 seconds (105 allocations: 4.828 KiB, 49.36% compilation time)
0.000026 seconds (105 allocations: 4.828 KiB, 50.38% compilation time)
0.000027 seconds (105 allocations: 4.828 KiB, 54.95% compilation time)
0.000019 seconds (106 allocations: 4.922 KiB, 49.73% compilation time)
0.000024 seconds (105 allocations: 4.828 KiB, 52.24% compilation time)
```
A local benchmark also shows that 72855cd slightly speeds up loading
of `OmniPackage.jl`:
```julia
julia> @time using OmniPackage
# v1.11rc4
20.525278 seconds (25.36 M allocations: 1.606 GiB, 8.48% gc time, 12.89% compilation time: 77% of which was recompilation)
# v1.11rc4+aee69a4+72855cd
19.527871 seconds (24.92 M allocations: 1.593 GiB, 8.88% gc time, 15.13% compilation time: 82% of which was recompilation)
```