Set and propagate devices in RRef completion future (#58674)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58674
I found this missing parameter while debugging failures in the next PR.
I'm very unhappy about this change. This future, which we know for sure will never contain tensors, shouldn't have to worry about CUDA devices. And yet it does: because callbacks chained onto it may produce CUDA tensors, the future must carry and propagate device information. This means that basically any future anywhere might have to worry about devices, which doesn't scale, and thus it's bad.
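To illustrate the problem, here is a minimal, hypothetical Python sketch (a toy class, not PyTorch's actual Future implementation): a parent future whose value contains no tensors still has to pass its device list to any child future created via `then()`, because the child's callback may allocate CUDA tensors and needs to know which devices to synchronize on.

```python
class ToyFuture:
    """Toy future that records the devices it may touch (illustration only)."""

    def __init__(self, devices=None):
        self.devices = list(devices or [])  # e.g. ["cuda:0"]
        self.value = None
        self._callbacks = []

    def set_result(self, value):
        self.value = value
        for cb, child in self._callbacks:
            child.set_result(cb(self))

    def then(self, callback):
        # The child inherits the parent's devices: its callback might
        # produce CUDA tensors, so it needs the same device context,
        # even if the parent's own value is device-free.
        child = ToyFuture(devices=self.devices)
        self._callbacks.append((callback, child))
        return child


# A completion future whose value is None must still propagate devices,
# because a downstream callback could produce CUDA results.
completion = ToyFuture(devices=["cuda:0"])
derived = completion.then(lambda fut: "produced on " + fut.devices[0])
completion.set_result(None)
# derived.devices is ["cuda:0"]; derived.value is "produced on cuda:0"
```

Dropping the `devices` parameter when constructing the completion future is exactly the kind of omission this fix addresses: the parent silently creates device-unaware children.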
ghstack-source-id: 129567042
Test Plan: Should fix the next diff.
Reviewed By: mrshenli
Differential Revision: D28574083
fbshipit-source-id: 5c89902cdc5cc12f1ebeea860b90cd9c3d7c7da1