[FSDP()] Require args as kwargs for `fully_shard()` (#89573)
I am not aware of any users of `FullyShardedDataParallel` who pass arguments after `process_group` positionally; I believe users pass them as keyword arguments. This PR formalizes that convention for `fully_shard()` by requiring its arguments to be passed as keyword arguments.
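For illustration, a minimal sketch of how keyword-only arguments are enforced in Python with a bare `*` in the signature. `shard_sketch` and its parameter list are hypothetical stand-ins, not the real `fully_shard()` signature, and the position of the `*` is an assumption made for the example.

```python
from typing import Any, Optional


def shard_sketch(
    module: Any,
    *,  # every parameter after the bare `*` is keyword-only
    process_group: Optional[Any] = None,
    mixed_precision: Optional[Any] = None,
) -> dict:
    """Hypothetical stand-in: it only records the arguments it received."""
    return {
        "module": module,
        "process_group": process_group,
        "mixed_precision": mixed_precision,
    }


# Passing the option by keyword is accepted.
config = shard_sketch("model", process_group="pg0")

# Passing it positionally is rejected with a TypeError.
try:
    shard_sketch("model", "pg0")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Callers that already pass these options by keyword are unaffected by such a change; only positional call sites would need updating.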
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89573
Approved by: https://github.com/mrshenli