[Pallas/MGPU] Skip output transfers when they don't depend on sequential dims
The earlier revisiting-related checks already prevented these transfers, so
behavior is unchanged, but skipping them outright also avoids paying for the
checks.
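
As a rough illustration of the condition in the title, the sketch below tests
whether an output's block index changes along any sequential grid dimension.
The helper name, its arguments, and the brute-force strategy are assumptions
made for this example; it is not the Pallas/MGPU implementation, which may
well inspect the index map in another way.

```python
import itertools


def depends_on_sequential_dims(index_map, grid, sequential_dims):
    """Return True if index_map's result varies with any sequential grid index.

    Hypothetical helper for illustration only: it enumerates the whole grid
    instead of analysing the index map symbolically.
    """
    parallel_dims = [d for d in range(len(grid)) if d not in sequential_dims]
    parallel_ranges = [range(grid[d]) for d in parallel_dims]
    sequential_ranges = [range(grid[d]) for d in sequential_dims]
    for parallel_idx in itertools.product(*parallel_ranges):
        seen = set()
        for sequential_idx in itertools.product(*sequential_ranges):
            # Rebuild the full grid index from its parallel and sequential parts.
            indices = [0] * len(grid)
            for d, i in zip(parallel_dims, parallel_idx):
                indices[d] = i
            for d, i in zip(sequential_dims, sequential_idx):
                indices[d] = i
            seen.add(tuple(index_map(*indices)))
        if len(seen) > 1:
            # The same parallel index maps to more than one output block.
            return True
    return False
```

When the helper returns False for an output, every sequential step addresses
the same output block, which is the case the title describes.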
PiperOrigin-RevId: 679516275