[reland] simplify init_from_local_shards API (#68021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68021
Reland of https://github.com/pytorch/pytorch/pull/64481. The previous PR had some internal failures that were not caught when it first landed.
This simplifies the `init_from_local_shards` API in sharded tensor so that the user only needs to pass in a list of `Shard` and the `overall_size`, instead of a ShardedTensorMetadata. We now do the all_gather internally to form a valid ShardedTensorMetadata.
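Illustration only, not the code in this PR: a minimal pure-Python sketch of the idea, assuming each rank contributes the offsets and sizes of its local shards and that the gathered result is validated against `overall_size`. `LocalShardMeta` and `build_global_metadata` are hypothetical names, and the all_gather is simulated with a plain list of per-rank lists so the sketch runs without a process group.

```python
from dataclasses import dataclass
from itertools import combinations
from math import prod
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class LocalShardMeta:
    """Hypothetical stand-in for the metadata carried by one local shard."""
    offsets: Tuple[int, ...]
    sizes: Tuple[int, ...]
    rank: int


def _overlaps(a: LocalShardMeta, b: LocalShardMeta) -> bool:
    # Two boxes overlap only if their ranges intersect in every dimension.
    return all(
        a_off < b_off + b_size and b_off < a_off + a_size
        for a_off, a_size, b_off, b_size in zip(a.offsets, a.sizes, b.offsets, b.sizes)
    )


def build_global_metadata(
    gathered: Sequence[Sequence[LocalShardMeta]],
    overall_size: Sequence[int],
) -> List[LocalShardMeta]:
    """Merge per-rank shard metadata (the simulated all_gather result)
    into one validated, offset-ordered list."""
    shards = [meta for per_rank in gathered for meta in per_rank]

    for meta in shards:
        if len(meta.offsets) != len(overall_size) or len(meta.sizes) != len(overall_size):
            raise ValueError(f"rank mismatch for shard {meta}")
        for off, size, total in zip(meta.offsets, meta.sizes, overall_size):
            if off < 0 or size < 0 or off + size > total:
                raise ValueError(f"shard {meta} is out of bounds")

    for a, b in combinations(shards, 2):
        if _overlaps(a, b):
            raise ValueError(f"shards overlap: {a} and {b}")

    # Non-overlapping, in-bounds shards cover the tensor iff volumes match.
    if sum(prod(meta.sizes) for meta in shards) != prod(overall_size):
        raise ValueError("shards do not cover overall_size")

    return sorted(shards, key=lambda meta: meta.offsets)


# Simulated all_gather: index i holds what rank i contributed.
gathered = [
    [LocalShardMeta(offsets=(0, 0), sizes=(5, 10), rank=0)],
    [LocalShardMeta(offsets=(5, 0), sizes=(5, 10), rank=1)],
]
global_meta = build_global_metadata(gathered, overall_size=(10, 10))
```

The point of the sketch is that callers only describe what they hold locally; the global view is derived and checked in one place rather than constructed by every caller.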
TODO: add more test cases to improve coverage.
ghstack-source-id: 143661119
Test Plan: TestShardedTensorFromLocalShards
Reviewed By: pritamdamania87
Differential Revision: D32147888
fbshipit-source-id: 897128b75224f4b9644471a04a64079f51e0d5fe