[AArch64][MachineCombiner] Combine sequences of gather patterns (#152979)
Reland of #142941
Squashed with fixes for #150004, #149585
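This optimization matches gather-like sequences in which
values are loaded one lane at a time into a single NEON
register, and replaces each sequence with loads into two
separate registers, which are then combined with a zip
instruction. Because each lane load depends on the previous
contents of its destination register, splitting the chain
in two shortens the critical path and improves memory-level
parallelism.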
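
The rewrite can be illustrated with a small portable model. This is only
a sketch, not the MachineCombiner code: the names `V4`, `load_lane`,
`zip1_64`, `gather_serial`, and `gather_split` are hypothetical, and the
choice of four 32-bit lanes with a zip of the low 64-bit halves is an
assumption made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Portable stand-in for a 128-bit NEON register with four 32-bit lanes.
using V4 = std::array<std::uint32_t, 4>;
using Ptrs = std::array<const std::uint32_t*, 4>;

// Models a per-lane load: writes *p into one lane and keeps the others,
// so the result depends on the previous value of the register.
V4 load_lane(V4 v, int lane, const std::uint32_t* p) {
  assert(lane >= 0 && lane < 4);
  v[lane] = *p;
  return v;
}

// Models a zip of the low 64-bit halves: { a.low64, b.low64 }.
V4 zip1_64(const V4& a, const V4& b) {
  return {a[0], a[1], b[0], b[1]};
}

// Before: one chain of four dependent lane loads into a single register.
V4 gather_serial(const Ptrs& p) {
  V4 v{};
  v = load_lane(v, 0, p[0]);
  v = load_lane(v, 1, p[1]);
  v = load_lane(v, 2, p[2]);
  v = load_lane(v, 3, p[3]);
  return v;
}

// After: two independent chains of two lane loads, joined by one zip.
V4 gather_split(const Ptrs& p) {
  V4 lo{}, hi{};
  lo = load_lane(lo, 0, p[0]);
  lo = load_lane(lo, 1, p[1]);
  hi = load_lane(hi, 0, p[2]);  // independent of the `lo` chain
  hi = load_lane(hi, 1, p[3]);
  return zip1_64(lo, hi);
}
```

Both functions produce the same vector. In the split form the longest
dependency chain is two lane loads plus the zip rather than four lane
loads, and the two chains can issue their loads concurrently.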
rdar://151851094