Use output memory format based on input for cudnn_convolution_relu (#62482)
Summary:
Currently when cudnn_convolution_relu is passed a channels last Tensor it will return a contiguous Tensor. This PR changes this behavior and bases the output format on the input format.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62482
Reviewed By: ngimel
Differential Revision: D30049905
Pulled By: cpuhrsch
fbshipit-source-id: 98521d14ee03466e7128a1912b9f754ffe10b448