Support activation broadcasting in XNNPACK Matmul (#24908)
### Description
1. Support activation broadcasting in XNNPACK Matmul
2. Fix a subtle bug when the activation is 1-D
Per the existing gating logic, 1-D activations were allowed, but the
batch size passed through did not account for them. It was always
`a->Shape()[0]`, which for a 1-D activation is the reduction dimension
(K) rather than a batch dimension. For a 1-D activation input, a `1` is
prepended to the shape, so the batch size should have been `1`. The
relevant test passed anyway, but I think the kernel would have written
outside the bounds of the output buffer because of the non-unit batch
being passed through.
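
The shape arithmetic behind the fix can be sketched as follows. This is an illustrative Python sketch, not the ONNX Runtime code: the helper names `rows_and_k` and `old_batch` are hypothetical, and it assumes a 2-D weight of shape `[K, N]` with the leading activation dimensions flattened into a single batch, following the usual `numpy.matmul` promotion rule for 1-D inputs.

```python
from math import prod


def rows_and_k(a_shape):
    """Return (batch, K) for an activation multiplied by a 2-D [K, N] weight.

    Hypothetical helper: a 1-D activation [K] is promoted to [1, K], and all
    leading dimensions of an N-D activation are flattened into one batch.
    """
    if len(a_shape) == 0:
        raise ValueError("MatMul does not accept scalar inputs")
    promoted = (1, *a_shape) if len(a_shape) == 1 else tuple(a_shape)
    return prod(promoted[:-1]), promoted[-1]


def old_batch(a_shape):
    """Mirror of the previous behaviour: always take the first dimension."""
    return a_shape[0]


n = 4  # assumed output width N of the weight
batch, k = rows_and_k((8,))  # 1-D activation with K = 8
expected_output_elems = batch * n  # output shape [N]
written_with_old_batch = old_batch((8,)) * n  # K rows instead of 1
```

With a 1-D activation of length 8 and `N = 4`, the output holds 4 elements, while a batch of `a->Shape()[0]` (here 8) would correspond to 32 elements, which is the suspected out-of-bounds write.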
### Motivation and Context
Resolve https://github.com/microsoft/onnxruntime/issues/24107
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>