llvm-project
82512937 - [VPlan] Move tail folding out of VPlanPredicator. NFC (#176143)

Commit
47 days ago
Currently the logic for introducing a header mask and predicating the vector loop region is done inside introduceMasksAndLinearize.

This splits the tail folding part out into an individual VPlan transform so that VPlanPredicator.cpp doesn't need to worry about tail folding, which seemed to be a temporary measure according to a comment in VPlanTransforms.h.

To perform tail folding independently, this splits the "body" of the vector loop region between the phis in the header and the branch + iv increment in the latch:

Before:

```
+-------------------------------------------+
|%iv = ...                                  |
|...                                        |
|%iv.next = add %iv, vfxuf                  |
|branch-on-count %iv.next, vector-trip-count|
+-------------------------------------------+
```

After:

```
+-------------------------------------------+
|%iv = ...                                  |
|%wide.iv = widen-canonical-iv ...          |
|%header-mask = icmp ule %wide.iv, BTC      |---+
|branch-on-cond %header-mask                |   |
+-------------------------------------------+   |
                     |                          |
                     v                          |
+-------------------------------------------+   |
|...                                        |   |
+-------------------------------------------+   |
                     |                          |
                     v                          |
+-------------------------------------------+   |
|%iv.next = add %iv, vfxuf                  |<--+
|branch-on-count %iv.next, vector-trip-count|
+-------------------------------------------+
```

Phis are then inserted in the latch for any value in the loop body that has outside uses, with poison as their incoming value from the header edge.

The motivation for this is to allow us to share the same "predicate all successor blocks" type of predication we do for tail folding, but for early-exit loops in #172454.

This may also allow us to directly emit an EVL based header mask, instead of having to match + transform the existing header mask in addExplicitVectorLength.

This also allows us to eventually handle recurrences in the same transform, avoiding the need to special case tail folding in addReductionResultComputation.
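For readers unfamiliar with tail folding, the header mask in the "After" diagram (`icmp ule %wide.iv, BTC`) can be emulated in plain scalar C++. The following is only an illustrative sketch and is not code from this commit: the function name `addOneTailFolded`, the fixed `VF` of 4 (standing in for `vfxuf`), and the per-lane loops are all hypothetical. It also assumes that the vector trip count is the scalar trip count rounded up to a multiple of `VF`, and that the loop is guarded against a zero trip count.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for "vfxuf" in the diagrams (VF * UF lanes per iteration).
constexpr std::size_t VF = 4;

// Adds 1 to every element of Data using the tail-folded loop shape.
// Returns the number of (masked) vector iterations that were executed.
std::size_t addOneTailFolded(std::vector<int> &Data) {
  const std::size_t TripCount = Data.size();
  if (TripCount == 0)
    return 0; // Assumption: a guard outside the loop handles zero trips.

  const std::size_t BTC = TripCount - 1; // backedge-taken count
  // Assumption: the trip count is rounded up to a multiple of VF, so no
  // scalar remainder loop is needed.
  const std::size_t VectorTripCount = (TripCount + VF - 1) / VF * VF;

  std::size_t Iterations = 0;
  std::size_t IV = 0;
  do {
    // Header: widen the canonical IV and compare each lane against BTC.
    bool HeaderMask[VF];
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      HeaderMask[Lane] = (IV + Lane) <= BTC; // icmp ule %wide.iv, BTC

    // Body: only lanes enabled by the header mask touch memory.
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      if (HeaderMask[Lane])
        Data[IV + Lane] += 1;

    // Latch: increment the IV and test it against the vector trip count.
    IV += VF; // %iv.next = add %iv, vfxuf
    ++Iterations;
  } while (IV != VectorTripCount); // branch-on-count

  return Iterations;
}
```

With six elements and `VF` of 4, the sketch runs two iterations: the first with all lanes active, the second with only the first two lanes active, so the out-of-bounds lanes never access `Data`.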