[mlir][xegpu] Refine layout assignment in XeGPU SIMT distribution. (#142687)

Changes:
* Decouple layout propagation from subgroup distribution and move it to
an independent pass.
* Refine layout assignment to correctly handle control-flow ops (scf.for, scf.while).
* Refine test cases.