[JIT] Add support for tolist for GPU-resident Tensors (#34554)
Summary:
**Summary**
This commit modifies the JIT implementation of `Tensor.tolist` so that it
can also be called on GPU-resident Tensors. If the Tensor is not on the CPU
when the operator is invoked, it is copied to the CPU before the rest of
the conversion to a list is done.
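
As a rough illustration of the copy-then-convert order described above, here
is a minimal pure-Python sketch. It is a toy model, not the JIT operator
itself: `ToyTensor`, its `device` tag, and `toy_tolist` are hypothetical names
invented for this example, and the "copy" is only a deep copy of nested lists.

```python
import copy
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ToyTensor:
    """Hypothetical stand-in for a tensor: nested-list data plus a device tag."""
    data: List[Any]
    device: str = "cpu"

    def cpu(self) -> "ToyTensor":
        # Model a device-to-host copy; a CPU tensor is returned unchanged.
        if self.device == "cpu":
            return self
        return ToyTensor(data=copy.deepcopy(self.data), device="cpu")


def toy_tolist(tensor: ToyTensor) -> List[Any]:
    # Step 1: if the tensor is not on the CPU, copy it there first.
    if tensor.device != "cpu":
        tensor = tensor.cpu()
    # Step 2: do the rest of the conversion on the CPU copy.
    return copy.deepcopy(tensor.data)


gpu_tensor = ToyTensor([[1, 2], [3, 4]], device="cuda:0")
result = toy_tolist(gpu_tensor)
print(result)
```

In this sketch the original object keeps its device tag; only the temporary
CPU copy is used for the conversion.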
**Testing**
This commit adds GPU versions of some of the existing CPU tests for this
feature.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34554
Differential Revision: D20392604
Pulled By: SplitInfinity
fbshipit-source-id: 69c17b98d866428c19d683588046169538aaf1e3