[MLIR][XeGPU] Lowering 2-Dimensional Reductions of N-D Tensors into Chained 1-D Reductions (#186034)
This PR relaxes the 2-D reduction lowering in the peephole optimization
pass so that the source tensor may have an N-D shape.
It also fixes a minor bug in the accumulator lowering of the current
implementation.
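
For illustration only, the idea behind the lowering can be sketched in plain
Python: reducing an N-D tensor over two dimensions gives the same result as
chaining two 1-D reductions. This is not the pass's code. All function names
below are hypothetical, a sum reduction is assumed, and folding the
accumulator in once after the last 1-D step is an assumption of this sketch,
not a description of the fix in this PR.

```python
from itertools import product


def shape_of(t):
    """Return the shape of a rectangular nested list (a scalar has shape ())."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0]
    return tuple(shape)


def get(t, idx):
    """Read the element of nested list t at the multi-index idx."""
    for i in idx:
        t = t[i]
    return t


def build(shape, fn, prefix=()):
    """Build a nested list of the given shape, calling fn(multi_index) per cell."""
    if not shape:
        return fn(prefix)
    return [build(shape[1:], fn, prefix + (i,)) for i in range(shape[0])]


def reduce_1d(t, dim):
    """Sum-reduce t along a single dimension (hypothetical helper)."""
    shape = shape_of(t)
    out_shape = shape[:dim] + shape[dim + 1:]

    def cell(idx):
        return sum(get(t, idx[:dim] + (k,) + idx[dim:]) for k in range(shape[dim]))

    return build(out_shape, cell)


def reduce_2d_direct(t, dims):
    """Reference: sum over both dimensions at once."""
    d0, d1 = sorted(dims)
    shape = shape_of(t)
    out_shape = tuple(s for i, s in enumerate(shape) if i not in (d0, d1))

    def cell(idx):
        total = 0
        for a, b in product(range(shape[d0]), range(shape[d1])):
            full = list(idx)
            full.insert(d0, a)  # restore the lower reduced dimension
            full.insert(d1, b)  # then the higher one
            total += get(t, full)
        return total

    return build(out_shape, cell)


def reduce_2d_chained(t, dims, acc=None):
    """Chain two 1-D reductions; add acc once at the end (assumed placement)."""
    d0, d1 = sorted(dims)
    # Reduce the higher dimension first so the index of the lower one stays valid.
    partial = reduce_1d(t, d1)
    reduced = reduce_1d(partial, d0)
    if acc is None:
        return reduced
    return build(shape_of(reduced), lambda idx: get(reduced, idx) + get(acc, idx))


# A 4-D source tensor of shape (2, 3, 4, 5), reduced over dimensions 1 and 3.
src = build((2, 3, 4, 5), lambda i: i[0] * 60 + i[1] * 20 + i[2] * 5 + i[3])
assert reduce_2d_chained(src, (1, 3)) == reduce_2d_direct(src, (1, 3))
```

In this sketch the higher dimension is reduced first so that the index of the
remaining reduction dimension does not shift between the two steps.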