[functorch] Added grid_sample backward batch rule (pytorch/functorch#284)
* Added grid_sample backward batch rule
Description:
- Added grid_sample backward batch rule: CPU and CUDA
- Updated tests
Notes:
In most cases I had to expand on dim 0 and could not reuse the tricks
from the forward pass, where the batch dim is merged with either the
channel dim or H_out, because merging there produces wrong grid grads.
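
A minimal shape-level sketch of the dim-0 expansion mentioned in the notes. It is not the actual batch rule: the helper names and example sizes are hypothetical, it only tracks shapes, and it assumes the usual grid_sample layouts, (N, C, H, W) for input and grad_output and (N, H_out, W_out, 2) for grid, with non-negative batch dims.

```python
def fold_batch_into_dim0(shape, bdim, batch_size):
    """Shape after moving the vmap batch dim to the front and folding it into N.

    `bdim` is None for an unbatched operand, which is treated as if it had
    been expanded to `batch_size` copies first.
    """
    shape = list(shape)
    if bdim is None:
        rest = shape  # unbatched: conceptually expanded along a new dim 0
    else:
        assert shape[bdim] == batch_size, "batch dim has unexpected size"
        rest = shape[:bdim] + shape[bdim + 1:]  # logical shape without the batch dim
    # rest[0] is N; (B, N, ...) becomes (B * N, ...)
    return (batch_size * rest[0], *rest[1:])


def unfold_dim0(shape, batch_size):
    """Split a (B * N, ...) result shape back into (B, N, ...)."""
    n, remainder = divmod(shape[0], batch_size)
    assert remainder == 0, "dim 0 is not a multiple of the batch size"
    return (batch_size, n, *shape[1:])


# Hypothetical sizes: B=3, N=2, C=4, H_in=W_in=5, H_out=6, W_out=7.
B = 3
grad_output = fold_batch_into_dim0((3, 2, 4, 6, 7), bdim=0, batch_size=B)
inp = fold_batch_into_dim0((2, 4, 5, 5), bdim=None, batch_size=B)
grid = fold_batch_into_dim0((2, 3, 6, 7, 2), bdim=1, batch_size=B)

# The unbatched backward would return grads shaped like `inp` and `grid`;
# the grid grad is then split back into (B, N, H_out, W_out, 2).
grad_grid = unfold_dim0(grid, B)
```

After folding, all three operands share the same leading size B * N, so the regular backward can be called once and its outputs split back per batch element.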
* Code updates according to the review
* Updated OutOfPlacePlumbing.cpp to the latest pytorch