Fix broken Pooling CUDA NHWC Ops and ensure NCHW / NHWC parity. (#19889)
### Description
Fixed all CUDA NHWC pooling operations that were broken and enabled the
NHWC CUDA pooling tests. Disabled all pooling tests that the CUDA EP
does not support.
### Motivation and Context
Ensure parity between CUDA NHWC and NCHW, and work towards having 100%
of tests enabled for the CUDA EP and the CUDA NHWC EP.
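
As an illustration of what NCHW / NHWC parity means for pooling, here is a minimal pure-Python sketch. It is not the code from this change: the helper names (`idx_nchw`, `idx_nhwc`, `max_pool`) are hypothetical, and it assumes a max pool with no padding or dilation. The same logical tensor is stored in both layouts, pooled, and the results are compared element by element.

```python
import random
from itertools import product

# Hypothetical helpers: map logical (n, c, h, w) coordinates to a flat offset.
def idx_nchw(n, c, h, w, C, H, W):
    return ((n * C + c) * H + h) * W + w

def idx_nhwc(n, c, h, w, C, H, W):
    return ((n * H + h) * W + w) * C + c

def max_pool(buf, shape, k, s, index):
    """Max pool a flat buffer; `shape` is the logical (N, C, H, W).

    Assumes no padding and no dilation (a simplification for illustration).
    """
    N, C, H, W = shape
    OH, OW = (H - k) // s + 1, (W - k) // s + 1
    out = [0.0] * (N * C * OH * OW)
    for n, c, oh, ow in product(range(N), range(C), range(OH), range(OW)):
        window = [
            buf[index(n, c, oh * s + dh, ow * s + dw, C, H, W)]
            for dh in range(k)
            for dw in range(k)
        ]
        # The output buffer uses the same layout as the input buffer.
        out[index(n, c, oh, ow, C, OH, OW)] = max(window)
    return out, (N, C, OH, OW)

random.seed(0)
N, C, H, W = 2, 3, 6, 6
shape = (N, C, H, W)

# Build the same logical tensor in both memory layouts.
x_nchw = [random.random() for _ in range(N * C * H * W)]
x_nhwc = [0.0] * len(x_nchw)
for n, c, h, w in product(range(N), range(C), range(H), range(W)):
    x_nhwc[idx_nhwc(n, c, h, w, C, H, W)] = x_nchw[idx_nchw(n, c, h, w, C, H, W)]

y_nchw, out_shape = max_pool(x_nchw, shape, 2, 2, idx_nchw)
y_nhwc, _ = max_pool(x_nhwc, shape, 2, 2, idx_nhwc)

# Parity: every logical output element matches across the two layouts.
_, _, OH, OW = out_shape
parity = all(
    y_nchw[idx_nchw(n, c, h, w, C, OH, OW)] == y_nhwc[idx_nhwc(n, c, h, w, C, OH, OW)]
    for n, c, h, w in product(range(N), range(C), range(OH), range(OW))
)
```

A parity test of this shape only checks that layout handling is consistent; it says nothing about which pooling variants a given execution provider supports.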
---------
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>