[mlir][SCFToGPU] Fix crash when converting affine.for with iter_args to GPU (#185073)
The convert-affine-for-to-gpu pass moved operations from the affine.for
loop body to the GPU launch kernel, then erased the original loop.
However, if the loop had iter_args (reduction loops), the moved
operations could still reference the loop body's block arguments (the
iter_args). When the loop was erased, those block arguments were
destroyed while still having live uses, triggering a use_empty()
assertion.
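
For illustration, a reduction loop of roughly the following shape takes
this path. This is a hand-written sketch, not the reproducer from
#116044; the function name, value names, and memref shape are
assumptions.

```mlir
// Hypothetical example: sum the elements of a buffer with an accumulator.
func.func @reduce(%buf: memref<128xf32>) -> f32 {
  %zero = arith.constant 0.0 : f32
  // %acc is an iter_arg, i.e. a block argument of the loop body.
  %sum = affine.for %i = 0 to 128 iter_args(%acc = %zero) -> (f32) {
    %v = affine.load %buf[%i] : memref<128xf32>
    // This use of %acc survives when the body is moved into gpu.launch.
    %next = arith.addf %acc, %v : f32
    affine.yield %next : f32
  }
  return %sum : f32
}
```

Here arith.addf still uses %acc after it is moved into the launch
region, so erasing the affine.for destroys a block argument that has a
live use.
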
Fix this by detecting loops with iter_args in collectBounds and
returning an error, so the conversion fails cleanly instead of crashing
when the loop is erased. Reduction loops cannot be trivially converted
to GPU kernels without dedicated handling of the accumulator semantics.
Fixes #116044
Assisted-by: Claude Code