[QNN EP] MaxPool input rank-3 auto pad bug fix (#24827)
- Previously, padding for rank-3 MaxPool was only computed for auto_pad="NOTSET", using the final output shape.
- While implementing auto_pad="VALID", identified a broader issue: padding must be derived from the recalculated rank-4 output shape.
- Added unit tests to cover all use cases of auto_pad.
- Enabled the previously failing unit test in the CPU pool tests.
### Description
This PR fixes an issue in the padding calculation logic for rank-3 MaxPool operations when auto_pad is used. The bug stemmed from using the final output shape (rank-3) to compute padding, rather than the correct intermediate shape (rank-4) that MaxPool actually operates on. The logic has been updated to use the reshaped rank-4 output for accurate padding computation. Unit tests have been added to validate behavior across all auto_pad modes.
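The corrected behavior can be illustrated with a minimal, self-contained sketch (not the actual QNN EP code; `ComputePads` and the hard-coded shapes below are hypothetical and for illustration only). The idea is that the rank-3 input is treated as rank-4 with a height of 1, and SAME_* padding is derived from that rank-4 shape rather than from the final rank-3 output shape.

```cpp
// Sketch: deriving auto_pad pads for a rank-3 MaxPool from the rank-4 shape
// that the EP actually pools over (after the pre/post Reshape nodes).
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Returns {pad_top, pad_left, pad_bottom, pad_right} for the rank-4 spatial dims.
std::vector<int64_t> ComputePads(const std::vector<int64_t>& in_hw,    // {H, W}; H == 1 after the rank-3 -> rank-4 reshape
                                 const std::vector<int64_t>& kernel,   // {kH, kW}
                                 const std::vector<int64_t>& strides,  // {sH, sW}
                                 const std::string& auto_pad) {
  std::vector<int64_t> pads(4, 0);
  if (auto_pad == "VALID" || auto_pad == "NOTSET") {
    return pads;  // VALID never pads; NOTSET pads come from the node's pads attribute.
  }
  for (size_t i = 0; i < 2; ++i) {
    // SAME_* output size is recomputed from the rank-4 input,
    // not taken from the final rank-3 output shape.
    const int64_t out = (in_hw[i] + strides[i] - 1) / strides[i];  // ceil(in / stride)
    const int64_t total = std::max<int64_t>((out - 1) * strides[i] + kernel[i] - in_hw[i], 0);
    const int64_t head = (auto_pad == "SAME_LOWER") ? (total + 1) / 2 : total / 2;
    pads[i] = head;
    pads[i + 2] = total - head;
  }
  return pads;
}

int main() {
  // Rank-3 input {N, C, W} = {1, 3, 7} is pooled as rank-4 {1, 3, 1, 7}.
  const std::vector<int64_t> in_hw = {1, 7};
  const auto pads = ComputePads(in_hw, /*kernel*/ {1, 3}, /*strides*/ {1, 2}, "SAME_UPPER");
  std::cout << pads[0] << " " << pads[1] << " " << pads[2] << " " << pads[3] << "\n";  // prints: 0 1 0 1
  return 0;
}
```

With this example's assumed shapes, SAME_UPPER over the reshaped width axis yields pads {0, 1, 0, 1}; the same formula applied to the final rank-3 output shape would not line up with the axes of the rank-4 tensor that is actually pooled.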
### Motivation and Context
While implementing support for auto_pad="VALID" in MaxPool, we discovered that the padding for rank-3 MaxPool was being calculated using the final output shape, which is rank-3. However, MaxPool internally operates on a reshaped rank-4 tensor (via pre- and post-processing reshapes). As a result, the padding logic was misaligned with the actual shape used during pooling, leading to test failures.