Move CUDA-related code of the TensorPipe agent to a separate file (#59377)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59377
This PR demonstrates that the CUDA parts of the TensorPipe agent now simply "plug on top" of the CPU-only parts. Ideally, the CPU-only parts could therefore go in libtorch and the CUDA-only parts in libtorch_cuda. Unfortunately, we can't do that just yet, because the TensorPipe agent depends on c10d (for its Store and its ProcessGroup), which lives in libtorch_python.
ghstack-source-id: 131326168
Test Plan: CI
Reviewed By: cbalioglu
Differential Revision: D28796429
fbshipit-source-id: 41b2eb8400c0da282f3750a4eea21ad83ee4a175