[Mosaic GPU] Support Slice and Transpose in the Pallas WGMMA lowering
This change also fixes transpose handling in the lowering and completely removes the use of TransposeTransform, relying on strides instead. If we don't discover any issues with this approach, we will also remove the transpose transform from the MLIR dialect.
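
For readers unfamiliar with the idea, here is a minimal pure-Python sketch of how a transpose or a slice can be described by strides and an offset alone, without a dedicated transform. It is an illustration only: `strided_read` is a hypothetical helper, not part of Mosaic GPU or Pallas, and the actual lowering may differ.

```python
# Hypothetical helper for illustration; not a Mosaic GPU or Pallas API.
def strided_read(buf, offset, shape, strides):
    """Read a 2D view of a flat buffer given offset, shape and strides (in elements)."""
    rows, cols = shape
    row_stride, col_stride = strides
    return [
        [buf[offset + i * row_stride + j * col_stride] for j in range(cols)]
        for i in range(rows)
    ]

buf = list(range(12))  # a 3x4 row-major tile stored in a flat buffer

# Plain view: row stride 4, column stride 1.
tile = strided_read(buf, 0, (3, 4), (4, 1))

# Transpose: swap the shape and the strides; the buffer is untouched.
transposed = strided_read(buf, 0, (4, 3), (1, 4))

# Slice of rows 1..2 and columns 2..3: shift the offset, keep the strides.
sliced = strided_read(buf, 1 * 4 + 2, (2, 2), (4, 1))
```

In this sketch both operations change only the view metadata (offset, shape, strides), which is why a separate transpose transform is not needed to describe them.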
PiperOrigin-RevId: 744618241