[cuDNN][cuDNN V8 API] Match V7 API behavior for `channels_last` stride coercion for cuDNN (#88699)
Addresses the ConvNeXt failure reported in https://github.com/pytorch/torchdynamo/issues/1833
cuDNN V7 has some stride "fixing" code that coerces cuDNN to use channels-last when size-1 dimensions leave the strides ambiguous enough to allow it. This code was omitted in V8, which seems to lead to performance regressions. This PR patches in the same fix for V8.
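For illustration, the idea behind the stride coercion can be sketched in plain Python. This is a minimal sketch, not the code in this PR: the helper name `fix_size_one_dim_strides` and its signature are hypothetical, and it assumes tensors are described in (N, C, spatial...) dimension order. The stride of a size-1 dimension never affects addressing, so it can be rewritten to whatever value the desired layout would use.

```python
def fix_size_one_dim_strides(sizes, strides, channels_last):
    """Hypothetical helper: rewrite strides of size-1 dims to match a layout.

    sizes/strides are given in (N, C, spatial...) order. Only dims of
    size 1 are changed, so the memory addressed by the tensor is the same.
    """
    ndim = len(sizes)
    # Walk dims from innermost to outermost for the target layout:
    #   channels-last: C, then spatial dims (last first), then N
    #   contiguous:    spatial dims (last first), then C, then N
    spatial = list(range(ndim - 1, 1, -1))
    order = ([1] + spatial if channels_last else spatial + [1]) + [0]

    fixed = list(strides)
    running = 1  # stride a dim would have at this position in the layout
    for d in order:
        if sizes[d] == 1:
            fixed[d] = running  # free to choose: pick the layout's value
        else:
            running *= sizes[d]
    return fixed


# A (2, 1, 4, 4) tensor with contiguous strides is also valid channels-last
# memory, because the only stride that differs belongs to the size-1 C dim.
sizes = [2, 1, 4, 4]
contiguous_strides = [16, 16, 4, 1]
nhwc_strides = fix_size_one_dim_strides(sizes, contiguous_strides, channels_last=True)
```

With the channels-last flag set, the size-1 channel stride becomes 1, so the descriptor reads as unambiguously channels-last; with the flag cleared, the same helper restores the contiguous form.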
CC @ngimel @ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88699
Approved by: https://github.com/ngimel