feature: CpuOffload pre_forward doesn't attempt to move if already on device (#3695)
* feature: added an optimisation to not attempt a device move if already on that device. This is more noticeable in diffusion loops with a large number of step iterations, where pre_forward can get called many times
* fix: linting
* Apply style fixes
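
A minimal sketch of the idea behind the optimisation, not the actual diff. `FakeModule` and `SketchCpuOffload` are hypothetical stand-ins written for illustration (no torch or accelerate dependency), and the device comparison is an assumption about how "already on device" might be checked:

```python
class FakeModule:
    """Hypothetical stand-in for a module; counts how often it is moved."""

    def __init__(self, device="cpu"):
        self.device = device
        self.move_calls = 0

    def to(self, device):
        # Every call here represents a (potentially costly) device transfer.
        self.move_calls += 1
        self.device = device
        return self


class SketchCpuOffload:
    """Illustrative hook only; names and logic are assumptions."""

    def __init__(self, execution_device):
        self.execution_device = execution_device

    def pre_forward(self, module, *args, **kwargs):
        # Skip the move when the module already sits on the execution device.
        if module.device != self.execution_device:
            module.to(self.execution_device)
        return args, kwargs


# Simulate a diffusion loop that triggers pre_forward on every step.
module = FakeModule(device="cpu")
hook = SketchCpuOffload(execution_device="cuda:0")
for _ in range(50):
    hook.pre_forward(module)
```

In this sketch the module is moved on the first call only; the remaining calls in the loop return without touching `to()`.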
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>