[AArch64] Fold zero-high vector inserts in MI peephole optimisation (#182835)
Summary
This patch follows on from #178227.
The previous ISel fold lowers the 64-bit case to:
```
fmov d0, x0
fmov d0, d0
```
which is not ideal: a single fmov d0, x0 would suffice.
The redundant copy comes from the INSERT_SUBREG/INSvi64lane.
This peephole detects <2 x i64> vectors made of a zeroed upper lane and a
low lane produced by FMOVXDr/FMOVDr, then removes the redundant copy.
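The reason the second fmov is redundant can be illustrated with a small
standalone model. This is only a sketch, not code from the patch: VReg,
fmovDX, fmovDD, sameBits and checkCopyIsRedundant are hypothetical names,
and the model assumes only the architectural rule that a write to a D
register zeroes the upper 64 bits of the corresponding vector register.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (hypothetical, not LLVM code): a 128-bit vector register
// represented as two 64-bit lanes.
struct VReg {
  uint64_t lo; // lane 0
  uint64_t hi; // lane 1
};

// Models `fmov d0, x0`: writes the low lane from a GPR value and
// zeroes the upper lane.
inline VReg fmovDX(uint64_t x) { return VReg{x, 0}; }

// Models `fmov d0, d0`: copies the low lane and zeroes the upper lane.
inline VReg fmovDD(const VReg &src) { return VReg{src.lo, 0}; }

// Bitwise comparison of both lanes.
inline bool sameBits(const VReg &a, const VReg &b) {
  return a.lo == b.lo && a.hi == b.hi;
}

// After `fmov d0, x0` the upper lane is already zero, so a following
// `fmov d0, d0` leaves the register unchanged.
inline void checkCopyIsRedundant(uint64_t x) {
  VReg once = fmovDX(x);
  VReg twice = fmovDD(once);
  assert(sameBits(once, twice));
}
```

Calling checkCopyIsRedundant with any 64-bit value should pass in this
model, which mirrors why the copy can be dropped once the low lane is
known to come from an instruction that already zeroes the upper lane.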
Also updates existing tests and adds MIR tests.