[PyTorch] Remove call_once from CUDACachingAllocator (#71668)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71668
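As https://en.cppreference.com/w/cpp/thread/call_once notes, initialization of function-local statics is guaranteed to occur only once, even when called from multiple threads, and may be more efficient than the equivalent code using std::call_once. Use function-local statics in CUDACachingAllocator instead.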
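For illustration only, a minimal sketch of the two patterns being compared. The names here (Config, makeConfig, configWithCallOnce, configWithLocalStatic, g_init_calls) are hypothetical and are not taken from CUDACachingAllocator; the sketch only assumes a value that must be initialized exactly once.

```cpp
#include <mutex>

// Hypothetical stand-in for state that must be initialized exactly once.
struct Config {
  int device_count;
};

// Counts how many times the initializer actually runs (illustration only).
int g_init_calls = 0;

Config makeConfig() {
  ++g_init_calls;
  return Config{8};
}

// Pattern being removed: an explicit std::once_flag guarding the init.
Config& configWithCallOnce() {
  static std::once_flag flag;
  static Config* config = nullptr;
  std::call_once(flag, [] { config = new Config(makeConfig()); });
  return *config;
}

// Replacement pattern: a function-local static. Since C++11 its
// initialization runs once, even if several threads call concurrently.
Config& configWithLocalStatic() {
  static Config config = makeConfig();
  return config;
}
```

Both accessors return the same object on every call and run makeConfig() once each; the second form lets the compiler emit its own guard instead of going through std::call_once.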
ghstack-source-id: 148013646
Reviewed By: ngimel
Differential Revision: D33722954
fbshipit-source-id: a2737c2d6dfdd23b26cbe34574b80e3da0d4b8a4
(cherry picked from commit a6ddb24558f41aff12f76ba49a28d0a3082aec20)