[functorch] Use empty tensor instead of efficient zeros tensor (pytorch/functorch#406)
Why?
- The efficient zeros tensor doesn't have an efficient implementation
  for convolution_backward yet.
- Allocating an empty tensor on CUDA doesn't actually launch any
  kernels, whereas creating a zero-filled tensor does.
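
A minimal sketch of the allocation difference, assuming a placeholder whose
values are never read. `make_placeholder` is a hypothetical helper for
illustration only, not a function in functorch:

```python
import torch

# Hypothetical helper (not part of functorch): build a placeholder tensor.
def make_placeholder(shape, device="cpu", fill_zero=False):
    if fill_zero:
        # Allocates memory and then fills it with zeros.
        return torch.zeros(shape, device=device)
    # Allocates memory only; the contents are uninitialized.
    return torch.empty(shape, device=device)

# Fall back to CPU so the sketch also runs on machines without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
placeholder = make_placeholder((2, 3), device=device)
zeros = make_placeholder((2, 3), device=device, fill_zero=True)
```

Both tensors have the same shape and device; only `zeros` has defined
contents, so the empty tensor is only a safe substitute where its values
are never used.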