[FSDP] Move tensor sharding logic to `FlatParamHandle` (#80000)
This moves the tensor sharding logic from `FullyShardedDataParallel` to `FlatParamHandle`. In particular, `_get_shard()` and its related subroutines are moved to `FlatParamHandle` as static methods.
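For context, here is a minimal sketch of what the relocated `_get_shard()` logic looks like as static methods on `FlatParamHandle` (the chunking/padding details are illustrative and may not match the exact implementation):

```python
import torch
import torch.nn.functional as F

class FlatParamHandle:
    @staticmethod
    def _get_unpadded_shard(tensor, rank, world_size):
        # Chunk the flattened tensor; ranks beyond the last chunk get an empty shard.
        chunks = torch.flatten(tensor).chunk(world_size)
        if len(chunks) < (rank + 1):
            chunk = chunks[0].new_empty(0)
        else:
            chunk = chunks[rank]
        # Right-padding needed so every rank's shard has equal numel.
        numel_to_pad = chunks[0].numel() - chunk.numel()
        return chunk, numel_to_pad

    @staticmethod
    def _get_shard(tensor, rank, world_size):
        chunk, numel_to_pad = FlatParamHandle._get_unpadded_shard(
            tensor, rank, world_size
        )
        shard = chunk.clone()  # clone so the shard owns its memory
        if numel_to_pad > 0:
            shard = F.pad(shard, [0, numel_to_pad])
        return shard, numel_to_pad
```

Keeping these as static methods means they carry no handle state, which should make them easier to reuse once the broader sharding logic moves over as well.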
The motivation is to begin refactoring the broader FSDP sharding logic in `_shard_parameters()` into `FlatParamHandle` (as part of the multiple-parameter-group effort and possible future pluggable sharding). In other words, in follow-ups, I hope to move
https://github.com/pytorch/pytorch/blob/cd089544631d3bcad3b620865543a649e2535cbf/torch/distributed/fsdp/fully_sharded_data_parallel.py#L1444-L1447
to be part of `FlatParamHandle`.
Differential Revision: [D37726060](https://our.internmc.facebook.com/intern/diff/D37726060)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80000
Approved by: https://github.com/fegin