[reland] Set and propagate devices in RRef completion future (#59211)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59211
Reland of https://github.com/pytorch/pytorch/pull/58674
I found that the devices parameter was missing from the RRef completion future while debugging failures in the next PR.
I'm very unhappy about this change. This future, which we know for sure will never contain tensors, shouldn't have to worry about CUDA devices, and yet it does. This means that essentially any future, anywhere, might have to worry about devices, and that doesn't scale.
ghstack-source-id: 130202843
Test Plan: Should fix the failures in the next diff.
Reviewed By: mrshenli
Differential Revision: D28623886
fbshipit-source-id: 6c82ed7c785ac3bf32fff7eec67cdd73b96aff28