[FSDP] Don't move if on CPU (#77720)
After offline discussion, we decided that moving a CPU module to GPU by default is too risky, because it can cause an OOM during init.
Theoretically, we should not OOM, because a module wrapped by FSDP is already required to fit on the GPU, i.e. during forward. However, temporary GPU tensors allocated during `__init__` could break this assumption, so for now it is better to give users a way to init on CPU if needed.
We still warn users to pass `device_id` for faster init if the model is on CPU.
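
A rough sketch of the behavior described above, for illustration only. `place_module` is a hypothetical helper and not the actual FSDP code; the warning text is also made up. It assumes the only inputs are the module and an optional `device_id`: with no `device_id`, a CPU module stays on CPU and a warning is emitted; with `device_id`, the module is moved.

```python
import warnings
from typing import Optional

import torch
import torch.nn as nn


def place_module(module: nn.Module, device_id: Optional[torch.device] = None) -> nn.Module:
    """Hypothetical helper mirroring the described behavior (not FSDP internals)."""
    param = next(module.parameters(), None)
    on_cpu = param is not None and param.device.type == "cpu"
    if device_id is not None:
        # Caller opted in: move the module to the requested device.
        return module.to(device_id)
    if on_cpu:
        # Default: leave the module on CPU, but point the user at device_id.
        warnings.warn("Module is on CPU; pass device_id for faster initialization.")
    return module


if __name__ == "__main__":
    layer = nn.Linear(4, 4)  # constructed on CPU
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        place_module(layer)
    # Print where the parameters live and how many warnings were raised.
    print(next(layer.parameters()).device.type, len(caught))
```

With real FSDP, the opt-in would be passing `device_id` (for example `torch.cuda.current_device()`) to the FSDP constructor.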
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77720
Approved by: https://github.com/zhaojuanmao