Flux2: Tensor tuples can cause issues for checkpointing (#12777)
* split tensors inside the transformer blocks to avoid checkpointing issues
* clean up, fix type hints
* fix merge error
* Apply style fixes
---------
Co-authored-by: s <you@example.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>