Fix DDPOptimizer fake_mode execution (#92986)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #92986
When running compiled submods to produce the outputs that are passed to the
compilation step for the next submod, we use fake parameters and assume fake
inputs, but we forgot to activate our fake_mode during execution. This caused
failures in certain edge cases where tensors other than activations or
parameters are created during execution, such as scalar->tensor expansion when
executing torch.where(tensor, scalar, scalar).
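For context, a minimal sketch of what activating the fake mode changes (the `run_fake` helper and `Mask` module below are illustrative placeholders, not the DDPOptimizer internals): any tensor materialized from scalars during execution only becomes a fake tensor if the mode is active while the submod runs.

```python
import torch
from torch._subclasses import FakeTensorMode

fake_mode = FakeTensorMode()

def run_fake(submod, *fake_inputs):
    # Execute a submod whose inputs/params are fake *inside* the fake mode,
    # so tensors created during execution are fake as well.
    with fake_mode:
        return submod(*fake_inputs)

class Mask(torch.nn.Module):
    def forward(self, cond):
        # scalar->tensor expansion: the scalars 1.0 / 0.0 are materialized
        # as tensors at runtime
        return torch.where(cond, 1.0, 0.0)

cond = fake_mode.from_tensor(torch.zeros(4, dtype=torch.bool))
out = run_fake(Mask(), cond)
print(type(out))  # a FakeTensor, because execution happened under fake_mode
```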
Also add a test and clarify behavior of DDPOptimizer via comments.
Fixes #92941
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92986
Approved by: https://github.com/bdhirsh