Use a FutureFactoryRegistry to allow libtorch_cpu files to create CUDAFuture (#56984)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56984
This PR prepares for creating `CUDAFuture` in rref_impl.cpp.
The solution is to add a `FutureFactoryRegistry` in `rpc/utils.*`. The
TensorPipe RPC agent is responsible for registering the `CUDAFuture`
factory and the `ivalue::Future` factory. We need this change, rather
than using the `USE_CUDA` macro directly in the RRef files, for the
following reason. There are three build targets: `torch_cpu`,
`torch_cuda`, and `torch_python`.
`torch_python` is built on top of the other two. `torch_cpu` is CPU-only:
it contains no CUDA-related code, and hence has no `USE_CUDA` macro.
The `tensorpipe_*` files are in `torch_python`, which does have access to
CUDA. However, the RRef source files are in `torch_cpu`, which cannot
contain CUDA code. The recommended solution is dynamic dispatch, which
this PR enables through the registry.
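
As a rough illustration of the dynamic-dispatch idea described above, here
is a minimal, standalone sketch of a factory registry. It is not the code
added by this PR: the `Future` and `CudaFuture` structs are hypothetical
stand-ins for `ivalue::Future` and `CUDAFuture`, and `DeviceKind`,
`registerFactory`, `create`, `registerAgentFactories`, and
`makeFutureFromCpuOnlyCode` are names assumed for this example only.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for ivalue::Future (visible to CPU-only code).
struct Future {
  virtual ~Future() = default;
  virtual std::string kind() const { return "cpu"; }
};

// Hypothetical stand-in for CUDAFuture (only visible to CUDA-aware code).
struct CudaFuture : Future {
  std::string kind() const override { return "cuda"; }
};

// Assumed key type for this sketch; the real registry may key differently.
enum class DeviceKind { CPU, CUDA };

class FutureFactoryRegistry {
 public:
  using Factory = std::function<std::shared_ptr<Future>()>;

  static FutureFactoryRegistry& instance() {
    static FutureFactoryRegistry registry;
    return registry;
  }

  // Called by code that can see the concrete future types.
  void registerFactory(DeviceKind kind, Factory factory) {
    assert(factory && "factory must be callable");
    std::lock_guard<std::mutex> guard(mutex_);
    factories_[static_cast<int>(kind)] = std::move(factory);
  }

  // Called by CPU-only code, which never names CudaFuture directly.
  std::shared_ptr<Future> create(DeviceKind kind) {
    Factory factory;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      auto it = factories_.find(static_cast<int>(kind));
      if (it == factories_.end()) {
        throw std::runtime_error("no future factory registered");
      }
      factory = it->second;
    }
    return factory();
  }

 private:
  std::mutex mutex_;
  std::unordered_map<int, Factory> factories_;
};

// Plays the role of the RPC agent: registers both factories at startup.
void registerAgentFactories() {
  auto& registry = FutureFactoryRegistry::instance();
  registry.registerFactory(
      DeviceKind::CPU, [] { return std::make_shared<Future>(); });
  registry.registerFactory(
      DeviceKind::CUDA, [] { return std::make_shared<CudaFuture>(); });
}

// Plays the role of an RRef source file: depends only on the registry.
std::shared_ptr<Future> makeFutureFromCpuOnlyCode(DeviceKind kind) {
  return FutureFactoryRegistry::instance().create(kind);
}
```

In this sketch, the CPU-only caller depends only on the base type and the
registry, so the CUDA-specific type can live in a different build target
and still be constructed at runtime once its factory has been registered.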
Test Plan: Imported from OSS
Reviewed By: lw
Differential Revision: D28020917
Pulled By: mrshenli
fbshipit-source-id: e67c76a273074aebb61877185cc5e6bc0a1a5448