[dtensor] simplify outputs wrapping handling (#120297)
This PR simplifies the output wrapping logic in op dispatch so that it
is easier to understand.
It also enables a new case: if the output DTensorSpec for a result is
None and the result is a scalar tensor, we return the scalar tensor
as-is instead of wrapping it in a DTensor.
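
A minimal sketch of the wrapping rule described above, for illustration
only. It is not the code from this PR: FakeTensor, FakeSpec, FakeDTensor
and wrap_output are hypothetical stand-ins, and raising on a non-scalar
tensor with no spec is an assumption, not something this PR states.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a local tensor; only the shape matters here."""
    shape: Tuple[int, ...]

    @property
    def ndim(self) -> int:
        return len(self.shape)


@dataclass
class FakeSpec:
    """Hypothetical stand-in for an output DTensorSpec."""
    placements: Tuple[str, ...]


@dataclass
class FakeDTensor:
    """Hypothetical stand-in for a DTensor wrapping a local tensor."""
    local: FakeTensor
    spec: FakeSpec


def wrap_output(res: Any, spec: Optional[FakeSpec]) -> Any:
    """Wrap a single op result according to its output spec."""
    if isinstance(res, FakeTensor):
        if spec is not None:
            # Usual case: a tensor result with a spec becomes a distributed tensor.
            return FakeDTensor(res, spec)
        # New case: no spec and a scalar (0-dim) tensor is returned unwrapped.
        if res.ndim != 0:
            # Assumption: a non-scalar tensor without a spec is treated as an error.
            raise ValueError("tensor output without a spec must be a scalar")
        return res
    # Non-tensor results (ints, bools, None, ...) pass through unchanged.
    return res


scalar = FakeTensor(shape=())
sharded = wrap_output(FakeTensor(shape=(4, 4)), FakeSpec(placements=("Shard(0)",)))
unwrapped = wrap_output(scalar, None)
```

Here `sharded` is a FakeDTensor, while `unwrapped` is the same scalar
object that was passed in.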
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120297
Approved by: https://github.com/wz337