llvm-project
af3c3ecb - [AArch64] recognise trn1/trn2 with flipped operands (#169858)

Commit
24 days ago
This PR is very similar to #167235, but applied to `trn` rather than `zip`.

There are two further differences:

- The `@combine_v8i16_8first` and `@combine_v8i16_8firstundef` test cases in `arm64-zip.ll` didn't have equivalents in `arm64-trn.ll`, so this PR adds new test cases `@vtrni8_8first`, `@vtrni8_9first`, `@vtrni8_89first_undef`.
- `AArch64TTIImpl::getShuffleCost` calls `isZIPMask`, but not `isTRNMask`. It relies on `Kind == TTI::SK_Transpose` instead (which in turn is based on `ShuffleVectorInst::isTransposeMask` through `improveShuffleKindFromMask`). Therefore, this PR does not itself influence the slp-vectorizer.

In a follow-up PR, I intend to override `AArch64TTIImpl::improveShuffleKindFromMask` to ensure we get `ShuffleKind::SK_Transpose` based on the new `isTRNMask`. In fact, that follow-up change is the actual motivation for this PR, as it will result in

```C++
int8x16_t g(int8_t x) {
  return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                       4, x, 5, x, 6, x, 7, x };
}
```

from #137447 being optimised by the slp-vectorizer.
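To make "flipped operands" concrete, here is a minimal standalone sketch of the kind of mask check described above. It is not LLVM's `isTRNMask`: the names `TrnMatch` and `matchTrnMaskSketch` are hypothetical, and the code assumes the usual shuffle-mask convention that indices `0..N-1` select from the first operand, `N..2N-1` from the second, and negative values mean undef. Under that assumption, `trn1` corresponds to `<0, N, 2, N+2, ...>`, `trn2` to `<1, N+1, 3, N+3, ...>`, and the flipped forms swap which operand feeds the even lanes.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type for this sketch (not part of LLVM).
struct TrnMatch {
  unsigned WhichResult;  // 0 for trn1, 1 for trn2
  bool OperandsFlipped;  // true if even lanes read from the second operand
};

// Hypothetical helper: checks whether Mask is a trn1/trn2 shuffle mask,
// optionally with the two source operands swapped. Negative mask entries
// are treated as undef and match anything.
std::optional<TrnMatch> matchTrnMaskSketch(const std::vector<int> &Mask) {
  const std::size_t NumElts = Mask.size();
  if (NumElts == 0 || NumElts % 2 != 0)
    return std::nullopt;
  const int N = static_cast<int>(NumElts);

  for (unsigned Which = 0; Which < 2; ++Which) {
    for (int Flip = 0; Flip < 2; ++Flip) {
      bool Ok = true;
      for (int I = 0; I < N && Ok; I += 2) {
        // Unflipped: even lane from operand 0, odd lane from operand 1.
        // Flipped: the other way round.
        const int Base = I + static_cast<int>(Which);
        const int EvenExpected = Base + (Flip ? N : 0);
        const int OddExpected = Base + (Flip ? 0 : N);
        if (Mask[I] >= 0 && Mask[I] != EvenExpected)
          Ok = false;
        if (Mask[I + 1] >= 0 && Mask[I + 1] != OddExpected)
          Ok = false;
      }
      if (Ok)
        return TrnMatch{Which, Flip != 0};
    }
  }
  return std::nullopt;
}
```

When a flipped match is found, a caller would emit the same `trn1`/`trn2` instruction with its two source registers swapped. Note that this sketch resolves ambiguous masks (for example, all-undef) in favour of the first pattern tried, which may differ from what the actual patch does.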