llvm-project
671072e8 - [AArch64] Unrolling of loops with vector instructions. (#147420)

Commit
116 days ago
This patch permits loops with vector instructions to be unrolled. Today there is an early exit in `getUnrollingPreferences()` of AArch64 targets if a vector instruction is observed in any of the loop blocks. This patch fixes that so common loops like this one get a chance to be unrolled:

    void saxpy(float *dst, const float *src, const float a, const int len) {
      float32x4_t *vdst = (float32x4_t *)dst;
      float32x4_t *vsrc = (float32x4_t *)src;
      float32x4_t vk = vdupq_n_f32(a);
      for (int i = 0; i < (len >> 2); i++) {
        vdst[i] = vaddq_f32(vdst[i], vmulq_f32(vsrc[i], vk));
      }
    }

Auto-vectorized loops are still not unrolled, unless they were not interleaved when vectorized.

The provided test case shows the enhancement on top of runtime/partial unrolling, depending on the CPU.

PR: https://github.com/llvm/llvm-project/pull/147420
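For readers unfamiliar with the transformation, the following is a rough sketch of what unrolling the saxpy loop by a factor of 2 looks like when written by hand. It is not the compiler's actual output: the unroll factor is an assumption (the real choice depends on the CPU and the cost model), and `vec4`, `vec4_madd`, `saxpy_rolled`, `saxpy_unrolled2`, and `unrolled_matches_rolled` are hypothetical names that stand in for `float32x4_t` and the NEON intrinsics so the sketch builds on any target.

```c
#include <assert.h>

/* Hypothetical stand-in for float32x4_t so the sketch builds on any target. */
typedef struct { float lane[4]; } vec4;

/* Per-lane d + s * k, mirroring vaddq_f32(d, vmulq_f32(s, vk)). */
static vec4 vec4_madd(vec4 d, vec4 s, float k) {
    vec4 r;
    for (int j = 0; j < 4; j++)
        r.lane[j] = d.lane[j] + s.lane[j] * k;
    return r;
}

/* Rolled form: one vector per iteration, as in the commit's example. */
static void saxpy_rolled(vec4 *dst, const vec4 *src, float k, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        dst[i] = vec4_madd(dst[i], src[i], k);
}

/* Unrolled by 2 (assumed factor): two vectors per iteration, followed by a
   remainder loop that handles an odd trip count. */
static void saxpy_unrolled2(vec4 *dst, const vec4 *src, float k, int n) {
    assert(n >= 0);
    int i = 0;
    for (; i + 1 < n; i += 2) {
        dst[i]     = vec4_madd(dst[i],     src[i],     k);
        dst[i + 1] = vec4_madd(dst[i + 1], src[i + 1], k);
    }
    for (; i < n; i++)
        dst[i] = vec4_madd(dst[i], src[i], k);
}

/* Returns 1 when both forms give identical results for n vectors (0 <= n <= 16).
   Inputs are small integers so every result is exactly representable. */
static int unrolled_matches_rolled(int n) {
    vec4 a[16], b[16], s[16];
    if (n < 0 || n > 16)
        return 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < 4; j++) {
            a[i].lane[j] = b[i].lane[j] = (float)(i * 4 + j);
            s[i].lane[j] = (float)(j + 1);
        }
    saxpy_rolled(a, s, 2.0f, n);
    saxpy_unrolled2(b, s, 2.0f, n);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < 4; j++)
            if (a[i].lane[j] != b[i].lane[j])
                return 0;
    return 1;
}
```

The unrolled body exposes two independent multiply-add chains per iteration and halves the number of loop-control branches; the remainder loop keeps the result correct when the trip count is not a multiple of the unroll factor.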