Fix incorrect pad indices in AveragePool count_include_pad computation (#27375)
### Description
Fixes #26708
`AveragePool2DTask` and `AveragePool3DTask` in `pool_functors.h` used incorrect
pad indices when computing `hend` (and `wend` in 3D) for the `count_include_pad`
divisor calculation.
ONNX pads format for 2D is `[h_begin, w_begin, h_end, w_end]`.
The code used `pads[1]` (w_begin) instead of `pads[2]` (h_end) to clamp `hend`,
so the bottom padding rows were excluded from the divisor when
`count_include_pad=1` and the pads are asymmetric (e.g. bottom/right only).
The same issue existed in `AveragePool3DTask` where pads format is
`[h_begin, w_begin, d_begin, h_end, w_end, d_end]`:
- `hend` used `pads[1]` instead of `pads[3]`
- `wend` used `pads[3]` instead of `pads[4]`
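
The index rule behind both fixes can be sketched as follows. With N spatial axes, ONNX stores all begin pads first and then all end pads, so the end pad for axis `i` sits at `pads[i + N]`. The helper names `BeginPad`, `EndPad`, and `ClampedWindowEnd` are hypothetical and do not come from `pool_functors.h`; this is only an illustration of the layout, not the actual patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers (not from pool_functors.h) illustrating the ONNX pads
// layout: [x1_begin, x2_begin, ..., x1_end, x2_end, ...].

// Begin pad for spatial axis `axis`.
inline int64_t BeginPad(const std::vector<int64_t>& pads, size_t axis) {
  assert(pads.size() % 2 == 0 && axis < pads.size() / 2);
  return pads[axis];
}

// End pad for spatial axis `axis`; end pads start at index pads.size() / 2.
inline int64_t EndPad(const std::vector<int64_t>& pads, size_t axis) {
  assert(pads.size() % 2 == 0 && axis < pads.size() / 2);
  return pads[axis + pads.size() / 2];
}

// Exclusive end of a pooling window along one axis when count_include_pad=1.
// `start` is measured from the first real input element (so it can be
// negative inside the begin padding). The window may reach into the end
// padding but not past it.
inline int64_t ClampedWindowEnd(int64_t start, int64_t kernel, int64_t dim,
                                const std::vector<int64_t>& pads, size_t axis) {
  return std::min(start + kernel, dim + EndPad(pads, axis));
}
```

For 2D pads, `EndPad(pads, 0)` reads `pads[2]` (h_end). For 3D pads, axes 0 and 1 read `pads[3]` (h_end) and `pads[4]` (w_end), which are the indices this fix switches to.
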
### Example
Input `[[1,1],[1,1]]` with `pads=[0,0,1,1]`, `kernel_shape=[2,2]`, `count_include_pad=1`:
- **Before (wrong):** `[[1.0, 0.5], [1.0, 0.5]]`
- **After (correct):** `[[1.0, 0.5], [0.5, 0.25]]`
Verified against ONNX ReferenceEvaluator.
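
The corrected values can be reproduced with a naive reference computation. The sketch below assumes a single channel, stride 1, and `count_include_pad=1`; the function name `AveragePool2DIncludePad` is hypothetical, and the code is a simplified illustration rather than the kernel in `pool_functors.h`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive single-channel 2D average pool, stride 1, count_include_pad=1.
// Hypothetical reference code for illustration only.
// `pads` follows the ONNX layout [h_begin, w_begin, h_end, w_end].
std::vector<float> AveragePool2DIncludePad(const std::vector<float>& x,
                                           int64_t height, int64_t width,
                                           int64_t kernel_h, int64_t kernel_w,
                                           const std::vector<int64_t>& pads) {
  assert(pads.size() == 4);
  assert(static_cast<int64_t>(x.size()) == height * width);
  const int64_t out_h = height + pads[0] + pads[2] - kernel_h + 1;
  const int64_t out_w = width + pads[1] + pads[3] - kernel_w + 1;
  std::vector<float> y(static_cast<size_t>(out_h * out_w));
  for (int64_t ph = 0; ph < out_h; ++ph) {
    const int64_t hstart = ph - pads[0];
    // Clamp with h_end (pads[2]); using pads[1] here is the bug being fixed.
    const int64_t hend = std::min(hstart + kernel_h, height + pads[2]);
    for (int64_t pw = 0; pw < out_w; ++pw) {
      const int64_t wstart = pw - pads[1];
      const int64_t wend = std::min(wstart + kernel_w, width + pads[3]);
      // The divisor counts padded cells that fall inside the window.
      const int64_t divisor = (hend - hstart) * (wend - wstart);
      float sum = 0.0f;
      // Only real input cells contribute to the sum; padding adds zero.
      for (int64_t h = std::max<int64_t>(hstart, 0); h < std::min(hend, height); ++h) {
        for (int64_t w = std::max<int64_t>(wstart, 0); w < std::min(wend, width); ++w) {
          sum += x[static_cast<size_t>(h * width + w)];
        }
      }
      y[static_cast<size_t>(ph * out_w + pw)] = sum / static_cast<float>(divisor);
    }
  }
  return y;
}
```

Calling `AveragePool2DIncludePad({1, 1, 1, 1}, 2, 2, 2, 2, {0, 0, 1, 1})` returns the row-major values `1.0, 0.5, 0.5, 0.25`, matching the "After" output: every window has a divisor of 4, and the bottom-right window contains only one real input cell.
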
### Changes
- `pool_functors.h`: Fix pad indices in `AveragePool2DTask` and `AveragePool3DTask`
- `pool_op_test.cc`: Add regression test `AveragePool_CountIncludePad_AsymmetricPads`
### Motivation and Context
When using `AveragePool` with `count_include_pad=1` and asymmetric padding
(e.g. only bottom and right), the average was computed with the wrong divisor,
producing results inconsistent with the ONNX spec.