[FSDP] Full state_dict rank0 only and CPU offload
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75908
Adds a `FullStateDictConfig` that lets users take a full state dict checkpoint that is materialized on rank 0 only and offloaded to CPU. The config's arguments are simply dispatched to `summon_full_params`.
Example:
```
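from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, FullStateDictConfig, StateDictType
cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, cfg):  # model is FSDP-wrapped
    state_dict = model.state_dict()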
```
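To illustrate the dispatch, here is a minimal pure-Python sketch that does not use PyTorch. `ToyFullStateDictConfig`, `toy_summon_full_params`, and `toy_full_state_dict` are hypothetical stand-ins, not the FSDP implementation; the assumption is that non-zero ranks end up with an empty state dict when rank-0-only mode is on.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class ToyFullStateDictConfig:
    # Hypothetical stand-in for FullStateDictConfig; field names are assumed.
    offload_to_cpu: bool = False
    rank0_only: bool = False


@contextmanager
def toy_summon_full_params(shards, rank, rank0_only, offload_to_cpu):
    # Toy stand-in for summon_full_params: concatenate every rank's shard
    # into full parameters, but expose them only on rank 0 if rank0_only.
    if rank0_only and rank != 0:
        yield {}
    else:
        device = "cpu" if offload_to_cpu else f"cuda:{rank}"
        full = {}
        for shard in shards:
            for name, values in shard.items():
                full.setdefault(name, []).extend(values)
        yield {name: (device, values) for name, values in full.items()}


def toy_full_state_dict(shards, rank, cfg):
    # The config fields are forwarded unchanged as keyword arguments.
    with toy_summon_full_params(
        shards,
        rank,
        rank0_only=cfg.rank0_only,
        offload_to_cpu=cfg.offload_to_cpu,
    ) as params:
        return dict(params)
```

With two toy shards of a parameter `w`, rank 0 would see the full parameter tagged as living on CPU, and rank 1 would see an empty dict.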
Differential Revision: [D35663270](https://our.internmc.facebook.com/intern/diff/D35663270/)
Approved by: https://github.com/zhaojuanmao