[Mosaic GPU] Run the standard MLIR CSE pass before Mosaic layout inference.
Layout inference can fail when the module contains unused `vector.load` ops. This CL adds a pass to remove them. The unused ops come from lowering expressions like `o[...] = a[...] + b[...]`: the lowering goes through `swap`, which always returns the old value of `o` even if it is not used.
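
For illustration only, here is a minimal pure-Python sketch of why the `swap`-based lowering leaves a dead load behind and how a cleanup pass drops it. The `Op` class, `lower_swap`, and `remove_dead_ops` are hypothetical names invented for this sketch. They are not the Mosaic GPU lowering or the MLIR CSE implementation, and the toy op list only mimics the shape of the real IR.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Op:
    """Toy stand-in for an IR op (hypothetical, not an MLIR class)."""
    name: str                       # e.g. "vector.load"
    result: Optional[str]           # value defined by the op, if any
    operands: Tuple[str, ...] = ()
    has_write_effect: bool = False  # ops that write memory are never dead


def lower_swap(ref: str, new_value: str, old_name: str) -> List[Op]:
    # Toy lowering of swap: read the old value, then store the new one.
    # The old value is produced whether or not anyone uses it.
    return [
        Op("vector.load", old_name, (ref,)),
        Op("vector.store", None, (new_value, ref), has_write_effect=True),
    ]


def remove_dead_ops(ops: List[Op]) -> List[Op]:
    """Drop ops whose result is unused and which do not write memory."""
    used = set()
    live = []
    # Walk backwards so every use is seen before its definition.
    for op in reversed(ops):
        if op.has_write_effect or (op.result is not None and op.result in used):
            live.append(op)
            used.update(op.operands)
    live.reverse()
    return live


# Toy lowering of `o[...] = a[...] + b[...]`.
ops = [
    Op("vector.load", "%a", ("%a_ref",)),
    Op("vector.load", "%b", ("%b_ref",)),
    Op("arith.addf", "%sum", ("%a", "%b")),
    *lower_swap("%o_ref", "%sum", "%old_o"),
]
cleaned = remove_dead_ops(ops)
```

In this toy model the load that defines `%old_o` has no users and no write effect, so it is the only op removed. The loads of `a` and `b` and the store to `o` are kept.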
PiperOrigin-RevId: 754892752