llvm-project
f59b5b8d - [MLIR][OpenMP] Fix standalone distribute on the device (#133094)

Commit
1 year ago
[MLIR][OpenMP] Fix standalone distribute on the device (#133094)

This patch updates the handling of target regions to set trip counts and kernel execution modes properly, based on clang's behavior. This fixes a race condition on `target teams distribute` constructs with no `parallel do` loop inside.

This is how kernels are classified, after changes introduced in this patch:

```f90
! Exec mode: SPMD.
! Trip count: Set.
!$omp target teams distribute parallel do
do i=...
end do

! Exec mode: Generic-SPMD.
! Trip count: Set (outer loop).
!$omp target teams distribute
do i=...
  !$omp parallel do private(idx, y)
  do j=...
  end do
end do

! Exec mode: Generic-SPMD.
! Trip count: Set (outer loop).
!$omp target teams distribute
do i=...
  !$omp parallel
  ...
  !$omp end parallel
end do

! Exec mode: Generic.
! Trip count: Set.
!$omp target teams distribute
do i=...
end do

! Exec mode: SPMD.
! Trip count: Not set.
!$omp target parallel do
do i=...
end do

! Exec mode: Generic.
! Trip count: Not set.
!$omp target
...
!$omp end target
```

For the split `target teams distribute + parallel do` case, clang produces a Generic kernel which gets promoted to Generic-SPMD by the openmp-opt pass. We can't currently replicate that behavior in flang because our codegen for these constructs results in the introduction of calls to the `kmpc_distribute_static_loop` family of functions, instead of `kmpc_distribute_static_init`, which currently prevent promotion of the kernel to Generic-SPMD.

For the time being, instead of relying on the openmp-opt pass, we look at the MLIR representation to find the Generic-SPMD pattern and directly tag the kernel as such during codegen. This is what we were already doing, but incorrectly matching other kinds of kernels as such in the process.
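The classification listed above can be restated as a small decision function. The following is only an illustrative sketch, not the code from this patch: the real implementation inspects MLIR operations, whereas `TargetRegion`, its boolean fields, `KernelInfo`, `classify`, and `checkCommitTable` are hypothetical names invented here to mirror the six cases in the listing.

```cpp
#include <cassert>

// Kernel execution modes named in the commit message.
enum class ExecMode { Generic, GenericSPMD, SPMD };

// HYPOTHETICAL, flattened summary of what a `target` region contains.
// The actual patch walks the MLIR representation instead of using flags.
struct TargetRegion {
  bool teamsDistribute;      // `target teams distribute ...`
  bool distributeParallelDo; // composite `distribute parallel do`
  bool targetParallelDo;     // `target parallel do` (no teams/distribute)
  bool parallelInsideLoop;   // `parallel` or `parallel do` nested in the
                             // distribute loop body
};

struct KernelInfo {
  ExecMode mode;
  bool tripCountSet;
};

// Mirrors the six cases from the commit message, assuming the flags above
// describe the region accurately.
constexpr KernelInfo classify(const TargetRegion &r) {
  return r.teamsDistribute
             ? (r.distributeParallelDo
                    ? KernelInfo{ExecMode::SPMD, true}
                    : (r.parallelInsideLoop
                           ? KernelInfo{ExecMode::GenericSPMD, true}
                           : KernelInfo{ExecMode::Generic, true}))
             : (r.targetParallelDo ? KernelInfo{ExecMode::SPMD, false}
                                   : KernelInfo{ExecMode::Generic, false});
}

// Runtime check of the cases that the commit message enumerates.
inline void checkCommitTable() {
  // target teams distribute parallel do
  assert(classify({true, true, false, false}).mode == ExecMode::SPMD);
  // target teams distribute + nested parallel (do)
  assert(classify({true, false, false, true}).mode == ExecMode::GenericSPMD);
  // standalone target teams distribute: the case this patch fixes
  assert(classify({true, false, false, false}).mode == ExecMode::Generic);
  assert(classify({true, false, false, false}).tripCountSet);
  // target parallel do
  assert(classify({false, false, true, false}).mode == ExecMode::SPMD);
  assert(!classify({false, false, true, false}).tripCountSet);
  // plain target
  assert(classify({false, false, false, false}).mode == ExecMode::Generic);
}
```

In this toy model the bug described above would correspond to returning `GenericSPMD` for a `teamsDistribute` region even when `parallelInsideLoop` is false; the standalone-distribute case needs to fall through to `Generic` while still reporting a trip count.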