[VectorCombine] foldShuffleChainsToReduce - add support for partial vector reductions (#195119)
Extend foldShuffleChainsToReduce to recognize partial reduction patterns where only a subvector of the full vector is being reduced.
For example, a <16 x i16> vector where the shuffle chain only reduces the lower 8 elements can now be folded into:
shufflevector (extract lower <8 x i16>) + vector.reduce.smax
The detection works by noticing when the bottom-up walk through the
shuffle/op chain ends before consuming the full vector. The number of
levels visited determines the subvector size (2^levels), and an
extract_subvector + scalar reduction replaces the original chain when
profitable.
Fixes #194617