[mlir][acc] Specialize compute region target during ACCComputeLowering (#201386)
During ACCComputeLowering, when an acc compute region (such as
acc.parallel) is split into in acc.kernel_environment and
acc.compute_region, all wait and async operands are transferred over.
This includes the multiple lists that are used to manage device_type
specific aspects eg:
`acc parallel async device_type(nvidia) async(2)`
This ends up as
`acc.parallel async([#acc.device_type<none>], %c2_i32 : i32
[#acc.device_type<nvidia>])`
And similarly, acc.kernel_environment inherited both async aspects.
However, during ACCComputeLowering, the pass knows its device_type
target. Thus it can directly create a single async (because only
async(2) applies when device_type is nvidia): `acc.kernel_environment
async(%c2_i32 : i32)`
This MR simplifies the operation to not hold all of the multiple lists
and updates to ACCComputeLowering pass to ensure to transfer only
relevant information. The intent/goal is that the none of the CG
operations will hold device_type specific lists.