ddp+dynamo: assorted fixes
* fix requeue so that a job that gets preempted can be resubmitted.
* add ADAM_CAPTURABLE to force use of the capturable optimizer.
* assert that torch.version.debug is False: a debug build skews
benchmark numbers in general and, for reasons not yet understood, is
particularly bad for NCCL performance.
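
For illustration only, here is a rough sketch of how the last two items
could look. It assumes ADAM_CAPTURABLE is read as an environment
variable; the helper names (adam_capturable_requested,
check_release_build) and the accepted truthy values are hypothetical
and not taken from this change.

```python
import os


def adam_capturable_requested(env=None):
    """Return True if ADAM_CAPTURABLE is set to a truthy value.

    Assumption: the flag is an environment variable; the set of
    accepted values below is illustrative, not from the PR.
    """
    env = os.environ if env is None else env
    return env.get("ADAM_CAPTURABLE", "0").strip().lower() in ("1", "true", "yes")


def check_release_build(debug_flag):
    """Refuse to benchmark on a debug build.

    The caller is expected to pass torch.version.debug here.
    """
    assert not debug_flag, (
        "torch.version.debug is set; debug builds are not suitable "
        "for benchmarking"
    )
```

In a benchmark script this might be called as
check_release_build(torch.version.debug), with the optimizer built as
torch.optim.Adam(params, capturable=adam_capturable_requested()).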
ghstack-source-id: c878827d8576a093a5432312a4c9188d4c9d2c18
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1223