[shard] Sharder and ShardingPlan prototype (#73873)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73873
Basic ShardingPlan interface and Sharder implementation:
1. We provide `ShardingPlan` so that users can specify the sharding strategy for every parameter of a given model. It includes `plan` for sharding the parameters, `output_plan` for tagging the output layout, and `return_local_tensor` for converting back to DDP.
2. Introduce the `shard_module` API, which takes an `nn.Module` and a `ShardingPlan` and shards the module according to the plan.
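To illustrate the shape of the two pieces above, here is a minimal pure-Python sketch. It is a toy model and not the PyTorch implementation: `ToyChunkSpec`, `ToyShardingPlan`, and `toy_shard_module` are hypothetical names, the field types are assumptions, and parameters are plain lists rather than tensors. Only the three field names (`plan`, `output_plan`, `return_local_tensor`) come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyChunkSpec:
    """Hypothetical stand-in for a sharding spec: split into equal chunks."""
    num_shards: int


@dataclass
class ToyShardingPlan:
    # Field names follow the summary; the types are assumptions for this sketch.
    plan: Dict[str, ToyChunkSpec]                        # parameter name -> how to shard it
    output_plan: Dict[str, ToyChunkSpec] = field(default_factory=dict)  # output layout tags
    return_local_tensor: List[str] = field(default_factory=list)        # names converted back


def toy_shard_module(
    params: Dict[str, List[float]], plan: ToyShardingPlan, rank: int
) -> Dict[str, List[float]]:
    """Return the slice of each parameter that `rank` would hold under `plan`."""
    local: Dict[str, List[float]] = {}
    for name, values in params.items():
        spec = plan.plan.get(name)
        if spec is None:
            # Parameters not named in the plan are kept whole in this toy model.
            local[name] = list(values)
            continue
        chunk = -(-len(values) // spec.num_shards)  # ceiling division
        local[name] = values[rank * chunk:(rank + 1) * chunk]
    return local


# Toy "module": one sharded weight and one unsharded bias.
params = {"fc1.weight": [1.0, 2.0, 3.0, 4.0], "fc1.bias": [0.5, 0.5]}
plan = ToyShardingPlan(
    plan={"fc1.weight": ToyChunkSpec(num_shards=2)},
    return_local_tensor=["fc1"],
)
rank0 = toy_shard_module(params, plan, rank=0)
rank1 = toy_shard_module(params, plan, rank=1)
```

In this sketch each rank keeps only its chunk of `fc1.weight`, while `fc1.bias` is left whole because the plan does not mention it. The toy function does not act on `output_plan` or `return_local_tensor`; they are carried on the plan only to show where they would live.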
TODO:
In the next PR we will introduce an extensible Sharder and ShardingPlanner.
ghstack-source-id: 154682421
Test Plan: test_sharding_plan.py
Reviewed By: pritamdamania87, fduwjj
Differential Revision: D34695159
fbshipit-source-id: 3d695803c4b7e9a7543177ade5b709b5f847baa9
(cherry picked from commit 670cd279b0e5304a9bf0ce6e6651a08273a77035)