llvm-project
c985f285 - [OMPIRBuilder] Hoist alloca's to entry blocks of compiler-emitted GPU reduction functions (#181359)

1 day ago
Fixes a bug in GPU reductions compiled with `-O0`. The following example triggered invalid memory accesses at runtime:

```fortran
program test_array_reduction
  integer :: red_array(1)
  integer :: i

  red_array = 0

  !$omp target teams distribute parallel do reduction(+:red_array)
  do i = 1, 100
    red_array(1) = red_array(1) + 4422
  end do
  !$omp end target teams distribute parallel do

  print *, red_array
end program test_array_reduction
```

The issue was caused by allocas for some temporary values in the combiner region of the reduction op being inlined beyond the entry blocks of the GPU reduction functions emitted by the compiler. This PR fixes the issue by hoisting all allocas to the entry block after the reduction functions are completely emitted by the compiler.
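To illustrate the hoisting step described above, here is a minimal, self-contained sketch on a toy IR model. It is not the code from this PR and does not use the LLVM API: `Instr`, `Block`, `Function`, and `hoistAllocasToEntry` are hypothetical names invented for this example. The sketch also assumes every alloca has a constant size, so moving it to the entry block cannot place it ahead of a value it depends on.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Toy stand-ins for IR constructs (hypothetical, not LLVM classes).
struct Instr {
    std::string text;  // printable form of the instruction
    bool isAlloca;     // true if this instruction allocates stack memory
};

struct Block {
    std::string name;
    std::vector<Instr> instrs;
};

struct Function {
    std::vector<Block> blocks;  // blocks[0] is the entry block
};

// Moves every alloca found outside the entry block into the entry block,
// right after the entry block's leading run of allocas. The relative order
// of the hoisted allocas is preserved. Returns the number of allocas hoisted.
std::size_t hoistAllocasToEntry(Function &fn) {
    if (fn.blocks.empty())
        return 0;

    std::vector<Instr> hoisted;
    for (std::size_t b = 1; b < fn.blocks.size(); ++b) {
        auto &instrs = fn.blocks[b].instrs;
        // Keep non-allocas in place (in order); gather allocas at the tail.
        auto firstAlloca = std::stable_partition(
            instrs.begin(), instrs.end(),
            [](const Instr &i) { return !i.isAlloca; });
        hoisted.insert(hoisted.end(),
                       std::make_move_iterator(firstAlloca),
                       std::make_move_iterator(instrs.end()));
        instrs.erase(firstAlloca, instrs.end());
    }

    // Insert before the first non-alloca instruction of the entry block.
    auto &entry = fn.blocks.front().instrs;
    auto insertPos = std::find_if(
        entry.begin(), entry.end(),
        [](const Instr &i) { return !i.isAlloca; });
    entry.insert(insertPos,
                 std::make_move_iterator(hoisted.begin()),
                 std::make_move_iterator(hoisted.end()));
    return hoisted.size();
}
```

Running the pass once over the finished function, rather than at each point where a combiner region is inlined, mirrors the commit's choice to hoist only after the reduction functions are completely emitted.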