[VectorCombine] Refine cost model and decision logic in foldSelectShuffle (#146694)

After PR #136329, shuffle indices may differ, which can cause the
existing cost-based logic to miss optimisation opportunities for
binop/shuffle sequences.
This patch refines the cost model in foldSelectShuffle to assess costs
more accurately, recognising that certain duplicate shuffles do not
require actual instructions.
Additionally, in break-even cases, the patch now checks whether the
pattern ultimately feeds into a vector reduction and, if so, lets the
transform proceed because it is likely to be profitable overall.
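
The two ideas above can be sketched in isolation. This is an
illustrative toy model only, not the code in VectorCombine: the names
Mask, costOfUniqueShuffles and shouldFold are hypothetical, costs are
plain integers rather than LLVM's cost type, and "duplicate" is assumed
here to mean "same mask".

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a shuffle mask; not an LLVM type.
using Mask = std::vector<int>;

// Charge each distinct mask once: a shuffle whose mask repeats an earlier
// one is assumed to reuse that instruction and so adds no cost.
int costOfUniqueShuffles(const std::vector<Mask> &masks, int costPerShuffle) {
  assert(costPerShuffle >= 0 && "toy model uses non-negative costs");
  std::set<Mask> unique(masks.begin(), masks.end());
  return static_cast<int>(unique.size()) * costPerShuffle;
}

// Fold when strictly cheaper; on a tie, fold only if the result
// ultimately feeds a vector reduction.
bool shouldFold(int costBefore, int costAfter, bool feedsReduction) {
  if (costAfter < costBefore)
    return true;
  return costAfter == costBefore && feedsReduction;
}
```

With three shuffles of which two share a mask, costOfUniqueShuffles
charges for two shuffles, and shouldFold(4, 4, true) accepts a
break-even fold that shouldFold(4, 4, false) rejects.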