[dtensor] set cuda device automatically, and refactor error handling (#97583)
This PR detects whether device_type is cuda. If cuda is passed in,
the current cuda device is set automatically for each process/thread
(this assumes homogeneous devices across hosts).
It also refactors the error handling code.
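
A minimal sketch of the idea, not the code from this PR: the helper names `pick_local_device` and `maybe_set_cuda_device` are hypothetical, and the `rank % device_count` mapping is an assumption that only holds when every host has the same number of devices.

```python
from typing import Optional


def pick_local_device(rank: int, devices_per_host: int) -> int:
    """Map a global rank to a local device index (hypothetical helper).

    Assumes homogeneous hosts, i.e. each host exposes the same number of devices.
    """
    if devices_per_host <= 0:
        raise ValueError("devices_per_host must be positive")
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return rank % devices_per_host


def maybe_set_cuda_device(device_type: str, rank: int) -> Optional[int]:
    """Set the current CUDA device for this process if device_type is cuda.

    Returns the chosen device index, or None when device_type is not cuda.
    """
    if device_type != "cuda":
        return None
    # Imported lazily so that non-cuda callers do not need torch installed.
    import torch

    if not torch.cuda.is_available():
        raise RuntimeError("device_type is cuda but CUDA is not available")
    index = pick_local_device(rank, torch.cuda.device_count())
    torch.cuda.set_device(index)
    return index
```

With 4 devices per host, rank 5 maps to local device 1; a non-cuda device_type leaves the current device untouched.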
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97583
Approved by: https://github.com/wz337, https://github.com/XilunWu