[PowerPC] Merge vsr(vsro(input, byte_shift), bit_shift) to vsrq(input, res_bit_shift) (#154388)
This change implements a PatFrag-based pattern match that combines
consecutive `VSRO (Vector Shift Right by Octet)` and `VSR (Vector Shift
Right)` instructions into a single `VSRQ (Vector Shift Right Quadword)`
instruction on Power10+ processors.
Vector right shift operations like `vec_srl(vec_sro(input, byte_shift),
bit_shift)` generate two separate instructions `(VSRO + VSR)` when they
could be optimised into a single `VSRQ` instruction that performs the
equivalent operation.
```
vsr(vsro (input, vsro_byte_shift), vsr_bit_shift) to vsrq(input, vsrq_bit_shift)
where vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift
```
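The equivalence above can be checked with a small software model. This is a minimal sketch, not code from the patch: `U128`, `shiftRight`, `modelVSRO`, `modelVSR`, `modelVSRQ` and `mergeIsEquivalent` are hypothetical names, and the models take already-decoded shift amounts rather than a full `VRB` register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value: hi holds the most-significant 64 bits,
// lo the least-significant 64 bits.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

inline bool operator==(const U128 &a, const U128 &b) {
  return a.hi == b.hi && a.lo == b.lo;
}

// Logical right shift by n bits (0 <= n <= 127), zero-filling on the left.
inline U128 shiftRight(U128 v, unsigned n) {
  assert(n < 128 && "shift amount must fit in 7 bits");
  if (n == 0)
    return v;
  if (n >= 64)
    return {0, v.hi >> (n - 64)};
  return {v.hi >> n, (v.lo >> n) | (v.hi << (64 - n))};
}

// Simplified models of the three instructions (assumption: the shift
// amount has already been extracted from the shift register).
inline U128 modelVSRO(U128 v, unsigned byteShift) {
  return shiftRight(v, (byteShift & 0xF) * 8); // 0..15 bytes
}
inline U128 modelVSR(U128 v, unsigned bitShift) {
  return shiftRight(v, bitShift & 0x7); // 0..7 bits
}
inline U128 modelVSRQ(U128 v, unsigned bitShift) {
  return shiftRight(v, bitShift & 0x7F); // 0..127 bits
}

// Compares vsr(vsro(input, byte), bit) with vsrq(input, byte * 8 + bit)
// for every legal (byte, bit) pair.
inline bool mergeIsEquivalent(U128 input) {
  for (unsigned byteShift = 0; byteShift < 16; ++byteShift)
    for (unsigned bitShift = 0; bitShift < 8; ++bitShift) {
      U128 twoStep = modelVSR(modelVSRO(input, byteShift), bitShift);
      U128 merged = modelVSRQ(input, byteShift * 8 + bitShift);
      if (!(twoStep == merged))
        return false;
    }
  return true;
}
```

The largest combined amount is `15 * 8 + 7 = 127`, so every pair fits in the 7-bit shift field that `vsrq` reads.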
Note:
```
vsro : Vector Shift Right by Octet VX-form
- vsro VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bytes specified in bits 121:124 of VSR[VRB+32].
- Bytes shifted out of byte 15 are lost.
- Zeros are supplied to the vacated bytes on the left.
- The result is placed into VSR[VRT+32].
vsr : Vector Shift Right VX-form
- vsr VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bits (0-7) specified in the 3-bit field at bits 125:127 of VSR[VRB+32].
- Bits shifted out of bit 127 are lost.
- Zeros are supplied to the vacated bits on the left.
- The result is placed into VSR[VRT+32], except if, for any byte element in VSR[VRB+32], the low-order 3 bits are not equal to the shift amount, then VSR[VRT+32] is undefined.
vsrq : Vector Shift Right Quadword VX-form
- vsrq VRT,VRA,VRB
- Let src1 be the contents of VSR[VRA+32]. Let src2 be the contents of VSR[VRB+32].
- src1 is shifted right by the number of bits specified in the low-order 7 bits of src2.
- Bits shifted out of the least-significant bit are lost.
- Zeros are supplied to the vacated bits on the left.
- The result is placed into VSR[VRT+32].
```
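The field positions in the notes above line up with the merged shift amount: the 4-bit byte count at bits 121:124 and the 3-bit bit count at bits 125:127 together form the 7-bit field at bits 121:127. The sketch below illustrates this under the assumption that bit 127 is the least-significant bit of the register, so all three fields live in its least-significant byte. The helper names are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Field decoders operating on the least-significant byte of the shift
// register (bits 120:127 in the numbering used by the notes).
inline unsigned vsroByteShift(uint8_t lowByte) {
  return (lowByte >> 3) & 0xF; // bits 121:124
}
inline unsigned vsrBitShift(uint8_t lowByte) {
  return lowByte & 0x7; // bits 125:127
}
inline unsigned vsrqBitShift(uint8_t lowByte) {
  return lowByte & 0x7F; // bits 121:127
}

// vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift
inline unsigned mergedShiftAmount(unsigned byteShift, unsigned bitShift) {
  assert(byteShift < 16 && bitShift < 8 && "shift fields out of range");
  return byteShift * 8 + bitShift;
}

// Checks that the vsro and vsr fields of any low byte recombine into the
// vsrq field of that same byte.
inline bool fieldsCompose() {
  for (unsigned b = 0; b < 256; ++b) {
    uint8_t lowByte = static_cast<uint8_t>(b);
    unsigned merged =
        mergedShiftAmount(vsroByteShift(lowByte), vsrBitShift(lowByte));
    if (merged != vsrqBitShift(lowByte))
      return false;
  }
  return true;
}
```

For example, a low byte of `0x2B` decodes to a byte shift of 5 and a bit shift of 3, which recombine to 43, the same value `vsrq` would read from that byte.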
---------
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>