Use CUDAGuard when serializing CUDA Tensors (#15807)
Summary:
Fixes #15308. Before this change, `torch.save` and `torch.load` would
initialize the CUDA context on GPU 0 if it hadn't been initialized
already, even if the serialized tensors were only on GPU 1.
This PR fixes that bug by using CUDAGuard in the storage serialization
path.
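For illustration only, the guard pattern can be sketched in plain Python.
This is not the PyTorch implementation: `FakeRuntime`, `device_guard`, and
`serialize_storage` are hypothetical names, and the model assumes that a
device's context is created the first time work runs while that device is
current.

```python
from contextlib import contextmanager


class FakeRuntime:
    """Hypothetical stand-in for the runtime's notion of a current device."""

    def __init__(self):
        self.current = 0          # index of the current device
        self.initialized = set()  # devices whose context has been created

    def set_device(self, index):
        self.current = index

    def touch(self):
        # Assumption: doing work creates a context on the current device.
        self.initialized.add(self.current)


@contextmanager
def device_guard(runtime, index):
    """Make `index` current for the block, then restore the previous device."""
    previous = runtime.current
    runtime.set_device(index)
    try:
        yield
    finally:
        runtime.set_device(previous)


def serialize_storage(runtime, storage_device):
    # The work runs with the storage's own device current, so no other
    # device is touched as a side effect.
    with device_guard(runtime, storage_device):
        runtime.touch()


runtime = FakeRuntime()
serialize_storage(runtime, storage_device=1)
print(sorted(runtime.initialized), runtime.current)  # prints: [1] 0
```

In this toy model only device 1 is initialized and the current device is
back to 0 afterwards, which is the behavior the fix aims for.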
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15807
Differential Revision: D13593201
Pulled By: zou3519
fbshipit-source-id: 4addc91ea5a5278d56a03f3d422577ee39e99897