Make VariableVersion::DISABLED the default constructor for VariableVersion. (#55572)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55572
We used to have VariableVersion default constructor
`VariableVersion(uint32_t version=0)`. But sometimes
we override the version_counter right after it's constructed.
E.g in SavedVariable/TensorImpl.
Thus we should make DISABLED the default constructor and else
where using explicit `VariableVersion(uint32_t)` constructor.
Note this PR effectively changes SavedVariable constructor (which overrides
version_counter_ inside) to use the DISABLED constructor and we
can see the gains in reduced instruction counts.
```
// benchmark code
timer = Timer(
"y = x * x",
"""
x = torch.rand((3, 3)).requires_grad_()
""",
language=Language.PYTHON,
)
λ ~ python compare.py
No CUDA runtime is found, using CUDA_HOME='/public/apps/cuda/10.2'
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts
object at 0x7f06c48b3a50>
7236 lookdict_unicode_nodummy
2600 torch::autograd::VariableType::(...)
100 0x0000000017751750
-5 unlink_chunk.isra.0
-100 0x000000001773e750
-402 _int_malloc
-1600 operator delete(...)
-1600 c10::intrusive_ptr_target::release_resources()
-2400 c10::VariableVersion::VersionCounter::~VersionCounter()
-3600 torch::autograd::SavedVariable::operator=(...)
-4800 operator new(...)
-6400 torch::autograd::SavedVariable::SavedVariable(...)
-7200 torch::autograd::SavedVariable::SavedVariable()
-8400 free
-16800 malloc
-24400 _int_free
Total: -67771
```
Note there're for other callsites(esp. view related) we just keep it unchanged by
explicitly calling `VariableVersion(uint32_t)` but we should be
able to optimize those in the followup PRs.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D27669074
Pulled By: ailzhang
fbshipit-source-id: a4deb297cc89142ae8bd683284516c881ddf3c87