Make DataPtr extraction in CUDAFuture faster for Python values (#56918)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56918
Re-importing a Python module on every call is somewhat expensive, and it is unnecessary here: the module is private and won't change, so we can cache the value the first time we extract it.
ghstack-source-id: 128184666
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D27985910
fbshipit-source-id: be40ae9b67ab8ea6c07bc2cb9a78d2c2c30b35d3