[Mosaic GPU] Allow load/store_tiled with refs where the tiling rank is smaller than the logical rank
Previously we assumed the tiling rank and the logical rank were equal, which meant that we
could not, for example, store into a 3D ref that only uses 2D tiling. `transfer_tiled` is already
capable of synthesizing schedules for such refs, so it was enough to slightly change the way we
construct the ref tiling objects.
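
For illustration only, here is a minimal sketch of the shape arithmetic involved. It assumes the tiling applies to the trailing dimensions of the ref and that leading dimensions are left untiled. `tiled_shape_sketch` is a hypothetical helper written for this message, not part of the Mosaic GPU API and not the code changed here.

```python
def tiled_shape_sketch(logical_shape, tiling):
    """Hypothetical helper: shape of a ref after tiling its trailing dims.

    Assumes the tiling covers only the last len(tiling) dimensions and
    that leading dimensions pass through unchanged.
    """
    rank, tiling_rank = len(logical_shape), len(tiling)
    if tiling_rank > rank:
        raise ValueError("tiling rank must not exceed the logical rank")
    # Split the logical shape into untiled leading dims and tiled trailing dims.
    untiled = tuple(logical_shape[: rank - tiling_rank])
    tiled = tuple(logical_shape[rank - tiling_rank:])
    for dim, tile in zip(tiled, tiling):
        if dim % tile != 0:
            raise ValueError(f"dimension {dim} is not divisible by tile {tile}")
    # Tile counts come first, followed by the tile dimensions themselves.
    tile_counts = tuple(dim // tile for dim, tile in zip(tiled, tiling))
    return untiled + tile_counts + tuple(tiling)


# A 3D logical shape with 2D tiling: the leading dimension stays as-is.
print(tiled_shape_sketch((4, 128, 64), (64, 8)))  # (4, 2, 8, 64, 8)
```

With equal ranks the same sketch reduces to the previously supported case, e.g. `(128, 64)` with tiling `(64, 8)` gives `(2, 8, 64, 8)`.
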
PiperOrigin-RevId: 858176765