[functorch] vmap: chunk_size support (#91157)
Ref: https://github.com/pytorch/functorch/issues/680
This PR adds a `chunk_size` kwarg to `vmap`.
The implementation reuses most of the code from `chunk_vmap`; the only new logic is splitting the input according to `chunk_size`.
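
For readers unfamiliar with the chunking idea, here is a minimal pure-Python sketch of what splitting by `chunk_size` means. It is not the code from this PR: the helpers `split_into_chunks`, `chunked_map`, and `batched_square` are hypothetical names used only for illustration, and plain lists stand in for tensors.

```python
from typing import Callable, List, Optional, Sequence


def split_into_chunks(batch: Sequence[float], chunk_size: int) -> List[List[float]]:
    """Split `batch` into consecutive pieces of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    return [list(batch[i:i + chunk_size]) for i in range(0, len(batch), chunk_size)]


def chunked_map(
    batched_fn: Callable[[Sequence[float]], Sequence[float]],
    batch: Sequence[float],
    chunk_size: Optional[int] = None,
) -> List[float]:
    """Apply `batched_fn` to the whole batch, or chunk by chunk if `chunk_size` is set."""
    if chunk_size is None:
        # No chunking requested: process the full batch in one call.
        return list(batched_fn(batch))
    out: List[float] = []
    for chunk in split_into_chunks(batch, chunk_size):
        # Each call sees at most `chunk_size` items, which bounds peak memory.
        out.extend(batched_fn(chunk))
    return out


def batched_square(xs: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a function that handles a whole batch at once."""
    return [x * x for x in xs]


result = chunked_map(batched_square, [1, 2, 3, 4, 5], chunk_size=2)
```

The last chunk may be shorter than `chunk_size` when the batch length is not a multiple of it. The result is the same as processing the full batch at once; only the size of each intermediate call changes.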
Benchmarks from https://github.com/pytorch/functorch/pull/774 apply.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91157
Approved by: https://github.com/zou3519