[dtensor] skip move to device when device_type match (#110774)
skip tensor.to in from_local and distribute_tensor when device_type of
device mesh matches tensor.device type, since from_local on the critial
path of TP, this might also reduce some overhead
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110774
Approved by: https://github.com/fduwjj