Make sure that we can load old optimizer checkpoints (#83588)
We want to make sure that we can load optimizer checkpoints that were saved with an older version of the code (one that doesn't contain the differentiable attribute).
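
A minimal sketch of the general idea, assuming the missing attribute can be filled in with a default when the saved state is restored. ToyOptimizer and restore are hypothetical names used only for illustration; this is not the actual torch.optim code, and the default value of False is an assumption.

```python
class ToyOptimizer:
    """Hypothetical stand-in for an optimizer (not the real torch.optim API)."""

    def __init__(self, lr=0.01, differentiable=False):
        self.defaults = {"lr": lr, "differentiable": differentiable}

    def __setstate__(self, state):
        # Restore whatever the checkpoint contains.
        self.__dict__.update(state)
        # Checkpoints written by older code have no "differentiable" entry,
        # so fall back to a default (assumed here to be False).
        self.defaults.setdefault("differentiable", False)


def restore(state):
    """Rebuild an optimizer from saved state, the way unpickling would:
    create the object without calling __init__, then call __setstate__."""
    opt = ToyOptimizer.__new__(ToyOptimizer)
    opt.__setstate__(state)
    return opt


# State as an older version of the code might have saved it.
old_state = {"defaults": {"lr": 0.1}}
opt = restore(old_state)
print(opt.defaults)
```

Using setdefault leaves newer checkpoints untouched: a value that was saved explicitly is kept, and only a missing key receives the fallback.
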
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83588
Approved by: https://github.com/mikaylagawarecki