llvm
67a641cf - [mlir][acc] Specialize compute region target during ACCComputeLowering (#201386)

Commit
3 days ago
[mlir][acc] Specialize compute region target during ACCComputeLowering (#201386)

During ACCComputeLowering, when an acc compute region (such as acc.parallel) is split into acc.kernel_environment and acc.compute_region, all wait and async operands are transferred over. This includes the multiple lists that are used to manage device_type-specific aspects, e.g.:
`acc parallel async device_type(nvidia) async(2)`
This ends up as:
`acc.parallel async([#acc.device_type<none>], %c2_i32 : i32 [#acc.device_type<nvidia>])`
Similarly, acc.kernel_environment inherited both async aspects.

However, during ACCComputeLowering, the pass knows its device_type target. Thus it can directly create a single async (because only async(2) applies when device_type is nvidia):
`acc.kernel_environment async(%c2_i32 : i32)`

This MR simplifies the operation so that it no longer holds the multiple lists, and updates the ACCComputeLowering pass to transfer only the relevant information. The goal is that none of the CG operations hold device_type-specific lists.
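The selection described above can be illustrated with a small standalone model. This is only a sketch, not the pass's actual code: `AsyncEntry` and `select_async` are hypothetical names, and the fallback to the `none` entry when the target has no specific entry is an assumption based on how OpenACC device_type clauses are usually resolved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AsyncEntry:
    """Hypothetical model of one per-device_type async entry."""
    device_type: str        # e.g. "none" or "nvidia"
    operand: Optional[str]  # SSA value name, or None for a bare `async`


def select_async(entries: List[AsyncEntry], target: str) -> Optional[AsyncEntry]:
    """Pick the single async entry that applies to `target`.

    Assumption: an entry for the exact target wins; otherwise the
    `none` (default) entry applies; otherwise there is no async.
    """
    by_type = {entry.device_type: entry for entry in entries}
    if target in by_type:
        return by_type[target]
    return by_type.get("none")


# Models `acc parallel async device_type(nvidia) async(2)`.
entries = [AsyncEntry("none", None), AsyncEntry("nvidia", "%c2_i32")]

# With nvidia as the target, only the async(2) entry is kept.
chosen = select_async(entries, "nvidia")
```

With the target fixed, the result is a single entry rather than a list keyed by device_type, which mirrors the simplification the commit message describes for acc.kernel_environment.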