[shard] add ShardedOptimizer (#68607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68607
This PR adds ShardedOptimizer and an API to get module parameters along with ShardedTensor params. It allows users to use this optimizer wrapper to construct an optimizer that involves ShardedTensor.
state_dict support will come in a follow-up diff.
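
A rough, torch-free sketch of the wrapper pattern described in this summary, for illustration only. Every name below (FakeShardedTensor, FakeModule, named_params_with_sharded, PlainSGD, ShardedOptimizerSketch) is a hypothetical stand-in and not the API added by this PR. It assumes the wrapper collects regular parameters together with each sharded tensor's local shards and hands them to an ordinary optimizer class:

```python
# Torch-free illustration. All classes and functions here are hypothetical
# stand-ins, not the PyTorch API added by this PR.

class FakeShardedTensor:
    """Stand-in for a ShardedTensor: holds only this rank's local shards."""
    def __init__(self, local_shards):
        self.local_shards = local_shards  # list of lists of floats


class FakeModule:
    """Stand-in for a module with regular params, sharded attrs, and children."""
    def __init__(self, params=None, sharded=None, children=None):
        self.params = params or {}
        self.sharded = sharded or {}
        self.children = children or {}


def named_params_with_sharded(module, prefix=""):
    """Yield (name, value) for regular params and sharded tensors, recursively."""
    for name, p in module.params.items():
        yield prefix + name, p
    for name, st in module.sharded.items():
        yield prefix + name, st
    for child_name, child in module.children.items():
        yield from named_params_with_sharded(child, prefix + child_name + ".")


class PlainSGD:
    """Minimal SGD over a list of mutable parameter lists."""
    def __init__(self, params, lr):
        self.params = list(params)
        self.lr = lr

    def step(self, grads):
        # Update each parameter in place: p <- p - lr * g
        for p, g in zip(self.params, grads):
            for i in range(len(p)):
                p[i] -= self.lr * g[i]


class ShardedOptimizerSketch:
    """Wrap an optimizer class so sharded params contribute their local shards."""
    def __init__(self, named_params, optimizer_class, **kwargs):
        self.named_params = dict(named_params)
        local = []
        for value in self.named_params.values():
            if isinstance(value, FakeShardedTensor):
                local.extend(value.local_shards)  # only what this rank owns
            else:
                local.append(value)
        self._optim = optimizer_class(local, **kwargs)

    def step(self, grads):
        self._optim.step(grads)


# Usage: one regular param plus one sharded param in a child module.
model = FakeModule(
    params={"bias": [1.0, 1.0]},
    children={"fc": FakeModule(sharded={"weight": FakeShardedTensor([[2.0, 4.0]])})},
)
opt = ShardedOptimizerSketch(named_params_with_sharded(model), PlainSGD, lr=0.5)
opt.step([[1.0, 1.0], [2.0, 2.0]])  # gradients in the same order as the params
```

The inner optimizer never needs to know about sharding in this sketch: it only sees plain local values, which is the point of putting the flattening step in a wrapper.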
ghstack-source-id: 145532834
Test Plan: python test_sharded_optim.py
Reviewed By: pritamdamania87
Differential Revision: D32539994
fbshipit-source-id: a3313c6870d1f1817fc3e08dc2fc27dc43bef743