[AArch64][SVE] Pair SVE fill/spill into LDP/STP with -msve-vector-bits=128. (#134068)
When compiling with -msve-vector-bits=128 or vscale_range(1, 1) and when
the offsets allow it, we can pair SVE LDR/STR instructions into Neon
LDP/STP.
For example, given:
```cpp
#include <arm_sve.h>
void foo(double const *ldp, double *stp) {
svbool_t pg = svptrue_b64();
svfloat64_t ld1 = svld1_f64(pg, ldp);
svfloat64_t ld2 = svld1_f64(pg, ldp+svcntd());
svst1_f64(pg, stp, ld1);
svst1_f64(pg, stp+svcntd(), ld2);
}
```
When compiled with `-msve-vector-bits=128`, we currently generate:
```gas
foo:
ldr z0, [x0]
ldr z1, [x0, #1, mul vl]
str z0, [x1]
str z1, [x1, #1, mul vl]
ret
```
With this patch, we instead generate:
```gas
foo:
ldp q0, q1, [x0]
stp q0, q1, [x1]
ret
```
This is an alternative, more targetted approach to #127500.