Dynamo runner: add FSDP handcrafted module wrapping policy (#111505)
Summary:
The default size-based auto-wrap policy may not be representative of how these models are used in practice. We add handcrafted module wrapping policies for a few handpicked models and fall back to the size-based policy for the rest (see the sketch below).
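A minimal sketch of the fallback pattern, not the exact runner code: `MODEL_FSDP_WRAP`, `get_wrap_policy`, and the per-model block class names are hypothetical/illustrative, while `ModuleWrapPolicy` and `size_based_auto_wrap_policy` are the real APIs from `torch.distributed.fsdp.wrap`.

```python
import functools

from torch.distributed.fsdp.wrap import (
    ModuleWrapPolicy,
    size_based_auto_wrap_policy,
)

# Hypothetical mapping from benchmark model name to the names of the
# module classes whose instances should each become their own FSDP unit.
MODEL_FSDP_WRAP = {
    "nanogpt": {"Block"},  # illustrative: nanoGPT's transformer block
}

def get_wrap_policy(model_name, model):
    """Return a handcrafted ModuleWrapPolicy for known models,
    otherwise fall back to the size-based auto-wrap policy."""
    block_names = MODEL_FSDP_WRAP.get(model_name)
    if block_names is not None:
        # Resolve the configured class names to classes actually
        # present in this model instance.
        classes = {
            type(m) for m in model.modules()
            if type(m).__name__ in block_names
        }
        if classes:
            return ModuleWrapPolicy(classes)
    # Fallback: wrap any submodule holding at least ~1M parameters.
    return functools.partial(
        size_based_auto_wrap_policy, min_num_params=int(1e6)
    )
```

The returned policy would then be passed as `auto_wrap_policy` when constructing `FullyShardedDataParallel`.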
Sample command:
`PYTHONPATH=~/benchmark/ python benchmarks/dynamo/torchbench.py -dcuda --training --backend=inductor --multiprocess --performance --only nanogpt --fsdp`
Measured speedups across 8 runs:
1.257x, 1.256x, 1.257x, 1.252x, 1.257x, 1.262x, 1.258x, 1.272x
X-link: https://github.com/pytorch/pytorch/pull/111505
Approved by: https://github.com/H-Huang, https://github.com/xuzhao9
Reviewed By: izaitsevfb
Differential Revision: D50676554
Pulled By: xmfan
fbshipit-source-id: d2f514f178648bc7f593418f13f6a664694069a8