llvm-project
3fc1aad6 - [PowerPC] Merge vsr(vsro(input, byte_shift), bit_shift) to vsrq(input, res_bit_shift) (#154388)

Commit
28 days ago
This change implements a PatFrag-based pattern matching ~dag combiner~ that combines consecutive `VSRO (Vector Shift Right Octet)` and `VSR (Vector Shift Right)` instructions into a single `VSRQ (Vector Shift Right Quadword)` instruction on Power10+ processors.

Vector right shift operations like `vec_srl(vec_sro(input, byte_shift), bit_shift)` generate two separate instructions `(VSRO + VSR)` when they could be optimised into a single `VSRQ` instruction that performs the equivalent operation.

```
vsr(vsro (input, vsro_byte_shift), vsr_bit_shift)
to
vsrq(input, vsrq_bit_shift)
where vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift
```

Note:
```
vsro : Vector Shift Right by Octet VX-form
- vsro VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bytes specified in bits 121:124 of VSR[VRB+32].
- Bytes shifted out of byte 15 are lost.
- Zeros are supplied to the vacated bytes on the left.
- The result is placed into VSR[VRT+32].

vsr : Vector Shift Right VX-form
- vsr VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bits specified in bits 125:127 of VSR[VRB+32] (3 bits).
- Bits shifted out of bit 127 are lost.
- Zeros are supplied to the vacated bits on the left.
- The result is placed into VSR[VRT+32], except if, for any byte element in VSR[VRB+32], the low-order 3 bits are not equal to the shift amount, then VSR[VRT+32] is undefined.

vsrq : Vector Shift Right Quadword VX-form
- vsrq VRT,VRA,VRB
- Let src1 be the contents of VSR[VRA+32]. Let src2 be the contents of VSR[VRB+32].
- src1 is shifted right by the number of bits specified in the low-order 7 bits of src2.
- Bits shifted out of the least-significant bit are lost.
- Zeros are supplied to the vacated bits on the left.
- The result is placed into VSR[VRT+32].
```

---------

Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
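The shift-amount identity above can be sanity-checked with a small scalar model. The sketch below is illustrative only and is not part of the patch: it treats a vector register as a 128-bit Python integer and assumes the shift fields (the 4-bit byte count for `vsro`, the 3-bit bit count for `vsr`, the 7-bit count for `vsrq`) have already been extracted from `VRB`. The function names `vsro`, `vsr`, `vsrq`, and `merged_shift` are hypothetical helpers, not LLVM or hardware APIs, and the model ignores the undefined-result case of `vsr`.

```python
MASK128 = (1 << 128) - 1  # a vector register modelled as a 128-bit unsigned integer


def vsro(value, byte_shift):
    """Model of vsro: shift right by 0..15 whole bytes (4-bit field), zero fill."""
    return (value & MASK128) >> (8 * (byte_shift & 0xF))


def vsr(value, bit_shift):
    """Model of vsr: shift right by 0..7 bits (3-bit field), zero fill."""
    return (value & MASK128) >> (bit_shift & 0x7)


def vsrq(value, shift):
    """Model of vsrq: shift right by 0..127 bits (low-order 7 bits), zero fill."""
    return (value & MASK128) >> (shift & 0x7F)


def merged_shift(byte_shift, bit_shift):
    """vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift."""
    return (byte_shift & 0xF) * 8 + (bit_shift & 0x7)


def check_equivalence(value):
    """Compare the two-instruction and one-instruction forms for every shift pair."""
    for byte_shift in range(16):
        for bit_shift in range(8):
            two_step = vsr(vsro(value, byte_shift), bit_shift)
            one_step = vsrq(value, merged_shift(byte_shift, bit_shift))
            if two_step != one_step:
                return False
    return True


if __name__ == "__main__":
    sample = 0x0123456789ABCDEF_FEDCBA9876543210
    print(check_equivalence(sample))
```

The largest merged amount is 15 * 8 + 7 = 127, which fits in the 7 bits that `vsrq` reads, so under these assumptions no shift pair falls outside the range of the single instruction.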