Handle stride > 1 with im2col in CUDA thnn conv2d (#54080)
Summary:
The fallback thnn 2d convolution uses `im2col` to extract patches and `gemm` to implement the convolution.
It has a shortcut to use `gemm` directly for kernel size 1, but this only works for stride == 1 and padding == 0.
This PR adds checks for stride == 1 and padding == 0 when determining whether `im2col` can be skipped.
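As a rough illustration of the condition described above (not the actual CUDA code from this PR), the sketch below uses two hypothetical helpers, `can_skip_im2col` and `conv_output_size`. It applies the standard convolution output-size formula to show why a 1x1 kernel alone is not enough: with stride > 1 or padding > 0 the output spatial size no longer matches the input, so a `gemm` applied directly to the input would not produce the right result.

```python
# Hypothetical helpers for illustration only; names and signatures are
# assumptions and do not come from the PyTorch source.

def can_skip_im2col(kernel, stride, padding):
    """Return True only when the input can be fed to gemm unchanged."""
    kH, kW = kernel
    sH, sW = stride
    pH, pW = padding
    return (kH == 1 and kW == 1        # 1x1 kernel: each patch is one pixel
            and sH == 1 and sW == 1    # every pixel is visited exactly once
            and pH == 0 and pW == 0)   # no extra border pixels are added


def conv_output_size(size, kernel, stride, padding):
    """Standard convolution output size along one dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# 1x1 kernel, stride 1, padding 0: output size equals input size,
# so the input already has the layout im2col would produce.
print(can_skip_im2col((1, 1), (1, 1), (0, 0)), conv_output_size(4, 1, 1, 0))

# 1x1 kernel, stride 2: the output is smaller than the input, so im2col is needed.
print(can_skip_im2col((1, 1), (2, 2), (0, 0)), conv_output_size(4, 1, 2, 0))

# 1x1 kernel, padding 1: the output is larger than the input, so im2col is needed.
print(can_skip_im2col((1, 1), (1, 1), (1, 1)), conv_output_size(4, 1, 1, 1))
```

In the first case the helper returns True and the output size equals the input size of 4; in the other two it returns False because the output size differs from the input size.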
Fixes https://github.com/pytorch/pytorch/issues/54036
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54080
Reviewed By: ejguan
Differential Revision: D27170482
Pulled By: zou3519
fbshipit-source-id: 055d6502239d34945934de409d78144d8a5c56f4