llvm-project
6c9ca02f - [mlir][acc] Add ACCSpecializeForDevice and ACCSpecializeForHost passe… (#173527)

Commit
6 days ago
[mlir][acc] Add ACCSpecializeForDevice and ACCSpecializeForHost passes (#173527)

Add two new transformation passes for specializing OpenACC IR for different execution contexts:

ACCSpecializeForDevice:
- Strips OpenACC constructs that are invalid in device code
- Replaces data entry ops with their var operands
- Unwraps regions from compute/data constructs
- Erases runtime operations (init, shutdown, wait, etc.)

This pass is applicable in two contexts:
1. Functions marked with `acc.specialized_routine` attribute, where the entire function body is device code
2. Non-specialized functions, where patterns are applied only to `acc` operations nested inside compute constructs (parallel, serial, kernels), not to the constructs themselves

ACCSpecializeForHost:
- Converts orphan `acc` operations for host execution
- Transforms `acc.atomic.*` to load/store via `PointerLikeType` interface
- Converts `acc.loop` to `scf.for` or `scf.execute_region`
- Replaces orphan data entry ops with their var operands

This pass operates in two modes:
1. Default (orphan) mode: Only converts `acc` operations that are not inside or attached to compute regions. Used for host `acc routine`s where compute constructs should be preserved.
2. Host fallback mode (enable-host-fallback=true): Converts ALL `acc` operations including compute constructs, data regions, and runtime ops. This is used to allow testing of the full conversion. These patterns will be used to handle conditional host execution of `acc` regions with if clause.

The pattern population functions (populateACCSpecializeForDevice, populateACCOrphanToHostPatterns, populateACCHostFallbackPatterns) are exposed so other passes can reuse these patterns.

---------

Co-authored-by: Susan Tan <zujunt@nvidia.com>
Co-authored-by: Scott Manley <rscottmanley@gmail.com>
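
The two contexts listed for ACCSpecializeForDevice can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is plain Python, not the MLIR pass or its API, and the `Op` class, the op-name sets, and `specialize_for_device` are hypothetical names introduced for illustration. It models region unwrapping and erasure of runtime ops only; replacing data entry ops with their var operands is left out because it would need SSA values.

```python
from dataclasses import dataclass, field

# Hypothetical groupings for this toy model; the real pass matches on
# MLIR operations and interfaces, not on name strings.
COMPUTE = {"acc.parallel", "acc.serial", "acc.kernels"}
DATA_CONSTRUCTS = {"acc.data"}
RUNTIME = {"acc.init", "acc.shutdown", "acc.wait"}


@dataclass
class Op:
    """Toy operation: a name plus a single nested region (list of ops)."""
    name: str
    body: list = field(default_factory=list)


def specialize_for_device(ops, in_device_code):
    """Return a rewritten copy of `ops`.

    in_device_code=True models a function whose whole body is device
    code (context 1). in_device_code=False models a non-specialized
    function (context 2): ops are kept as they are until a compute
    construct is entered, and only the ops nested inside it are rewritten.
    """
    out = []
    for op in ops:
        if not in_device_code:
            # Keep the op itself; rewriting starts only inside a compute construct.
            inside = op.name in COMPUTE
            out.append(Op(op.name, specialize_for_device(op.body, inside)))
        elif op.name in RUNTIME:
            # Runtime operations are erased in device code.
            continue
        elif op.name in COMPUTE or op.name in DATA_CONSTRUCTS:
            # Unwrap the region: splice the rewritten body into the parent.
            out.extend(specialize_for_device(op.body, True))
        else:
            out.append(Op(op.name, specialize_for_device(op.body, True)))
    return out


# Context 1: whole function body is device code.
device_fn = [Op("acc.init"), Op("acc.parallel", [Op("arith.addi")])]
device_result = specialize_for_device(device_fn, in_device_code=True)

# Context 2: non-specialized function; the compute construct itself is kept.
host_fn = [
    Op("acc.init"),
    Op("acc.parallel", [Op("acc.wait"), Op("acc.data", [Op("arith.addi")])]),
]
host_result = specialize_for_device(host_fn, in_device_code=False)
```

In the second call the top-level `acc.init` and the `acc.parallel` construct are left in place, while the `acc.wait` and the `acc.data` wrapper nested inside the construct are removed, which mirrors the "not to the constructs themselves" wording above.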