[FSDP] First implementation of ParamExecOrderWrapPolicy (non-recursive wrap policy) (#79238)
This is the first PR for ParamExecOrderWrapPolicy, a wrapping policy that wraps parameters and schedules communication based on the order in which parameters are executed in the forward pass. It is also called the non-recursive wrapping policy.
This PR includes:
- The basic API for using this policy.
- A helper function to get the parameter execution order in the first forward and backward passes.
Other parts will be implemented in future PRs.
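
To illustrate what "parameter execution order" means, here is a minimal sketch, not the helper added in this PR. It assumes that recording which module runs first via `nn.Module.register_forward_pre_hook` is a good enough approximation for the forward pass; it does not cover the backward pass. The names `record_param_exec_order` and `TwoLayer` are hypothetical and exist only for this example.

```python
from typing import Callable, List

import torch
import torch.nn as nn


def record_param_exec_order(
    model: nn.Module, run_forward: Callable[[], object]
) -> List[str]:
    """Hypothetical helper: list parameter names in the order their owning
    modules are first called during one forward pass."""
    order: List[str] = []
    seen = set()  # ids of parameters already recorded (handles shared params)
    handles = []

    def make_hook(prefix: str):
        def hook(module, inputs):
            # Only the parameters owned directly by this module.
            for name, param in module.named_parameters(recurse=False):
                if id(param) not in seen:
                    seen.add(id(param))
                    order.append(f"{prefix}.{name}" if prefix else name)
        return hook

    for prefix, module in model.named_modules():
        handles.append(module.register_forward_pre_hook(make_hook(prefix)))
    try:
        run_forward()
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return order


class TwoLayer(nn.Module):
    """Toy model whose execution order differs from its definition order."""

    def __init__(self):
        super().__init__()
        self.first_defined = nn.Linear(4, 4)
        self.second_defined = nn.Linear(4, 4)

    def forward(self, x):
        # second_defined runs before first_defined.
        return self.first_defined(self.second_defined(x))


model = TwoLayer()
definition_order = [name for name, _ in model.named_parameters()]
exec_order = record_param_exec_order(model, lambda: model(torch.randn(2, 4)))
print(exec_order)
```

In this toy model, `model.named_parameters()` lists the `first_defined` parameters first, while the recorded order starts with `second_defined`. That difference is the kind of information an execution-order-based policy needs and a definition-order traversal cannot provide.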
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79238
Approved by: https://github.com/zhaojuanmao, https://github.com/awgu