make sharding strategy configurable and support zero2 algorithm (#73819)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73819
Adds a new `sharding_strategy` config to the FSDP API to support different data-parallel algorithms. Also adds support for the ZeRO-2 algorithm, which shards only the optimizer states and gradients (parameters stay fully replicated on every rank).
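The ZeRO-2 behavior described above can be sketched with a toy single-process simulation: every "worker" keeps a full parameter replica, gradients are reduce-scattered so each worker holds only its shard, optimizer state (here, SGD momentum) exists only per-shard, and updated parameter shards are all-gathered back into a full replica. The function name and the plain-list stand-ins for the collectives are illustrative only, not part of the FSDP API:

```python
def zero2_sgd_step(params, grads_per_worker, momenta_shards, lr=0.1, beta=0.9):
    """One momentum-SGD step under a toy ZeRO-2 layout.

    params            -- full parameter list (replicated on all workers)
    grads_per_worker  -- per-worker local gradients (full length each)
    momenta_shards    -- per-worker momentum buffers, each only 1/n long;
                         mutated in place, mimicking sharded optimizer state
    """
    n = len(grads_per_worker)
    shard = len(params) // n  # assume params divide evenly for simplicity

    # "reduce" part of reduce-scatter: average gradients across workers
    avg = [sum(g[i] for g in grads_per_worker) / n for i in range(len(params))]

    new_shards = []
    for r in range(n):
        lo = r * shard
        # "scatter" part: worker r only ever touches its own shard of the
        # averaged gradient and its own shard of the momentum state
        m = [beta * momenta_shards[r][j] + avg[lo + j] for j in range(shard)]
        p = [params[lo + j] - lr * m[j] for j in range(shard)]
        momenta_shards[r] = m
        new_shards.append(p)

    # all-gather: reassemble the full, replicated parameter vector
    return [x for s in new_shards for x in s]
```

Because the gradient averaging is exact, this produces the same update as plain replicated momentum SGD while each worker stores only a 1/n slice of gradient and optimizer state, which is the memory saving ZeRO-2 targets.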
ghstack-source-id: 151454460
Test Plan: unit tests
Reviewed By: rohan-varma
Differential Revision: D34662583
fbshipit-source-id: 14c6e0c0054692ecd76512c025d60deb4964ec5f
(cherry picked from commit 51382e882447b4756c4ee6d94ce0939a25955b00)