Add missing MatMulInteger int8 x uint8 support (#26744)
### Description
This PR adds support for the `MatMulInteger` operator when
input `A` is `int8` and input `B` is `uint8`, and adds unit tests
to cover this type combination.
According to the ONNX specification for `MatMulInteger`, the type
constraints are:
- `T1 ∈ {int8, uint8}`
- `T2 ∈ {int8, uint8}`
- `T3 = int32`

This means all four `(T1, T2)` combinations are valid: `(int8, int8)`,
`(int8, uint8)`, `(uint8, int8)`, and `(uint8, uint8)`. However, the
implementation was missing the `(int8, uint8)` registration, which caused a
`NOT_IMPLEMENTED` error at runtime for such models.

This PR aligns the kernel registration and tests with the ONNX spec.
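
For reference, the ONNX spec defines `MatMulInteger` as a matrix product of the zero-point-adjusted inputs, accumulated in `int32`. The following is a minimal pure-Python sketch of those semantics for the `A=int8, B=uint8` case with scalar zero points. It is not the ONNX Runtime kernel; `matmul_integer_ref` and `check_range` are hypothetical helper names used only for illustration.

```python
# Illustrative reference only; not the ONNX Runtime kernel.
INT8_RANGE = (-128, 127)
UINT8_RANGE = (0, 255)


def check_range(matrix, bounds, name):
    """Raise if any element falls outside the given inclusive bounds."""
    lo, hi = bounds
    for row in matrix:
        for value in row:
            if not lo <= value <= hi:
                raise ValueError(f"{name} value {value} outside [{lo}, {hi}]")


def matmul_integer_ref(a, b, a_zero_point=0, b_zero_point=0):
    """Y[m][n] = sum_k (A[m][k] - a_zp) * (B[k][n] - b_zp), A int8, B uint8."""
    check_range(a, INT8_RANGE, "A")
    check_range(b, UINT8_RANGE, "B")
    inner, cols = len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    return [
        [
            sum((row[k] - a_zero_point) * (b[k][n] - b_zero_point)
                for k in range(inner))
            for n in range(cols)
        ]
        for row in a
    ]


a = [[-128, 127], [1, -1]]   # int8 values
b = [[255, 0], [2, 128]]     # uint8 values
y = matmul_integer_ref(a, b, a_zero_point=0, b_zero_point=128)
print(y)  # [[-32258, 16384], [253, -128]]
```

Python integers do not overflow, so the sketch does not model `int32` wraparound; for small shapes like this one the results fit comfortably in `int32`.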
### Motivation and Context
Fixes [#26743](https://github.com/microsoft/onnxruntime/issues/26743)
### Testing
- Added unit tests for the `A=int8, B=uint8` combination:
  - `MatmulIntegerOpTest.MatMulInteger_int8_uint8_2D`
  - `MatmulIntegerOpTest.MatMulInteger_int8_uint8_PerColumn_ND`
- All tests pass locally.
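
The `PerColumn` test name suggests per-column zero points for `B`, which the ONNX spec allows (one zero point per column of a `[K, N]` input). As a rough sketch of what such a case computes, assuming a 2-D `B` and an `A` zero point of 0, the expected output could be derived as follows. The helper name `matmul_integer_per_column` and the input values are hypothetical and are not taken from the actual tests.

```python
def matmul_integer_per_column(a, b, b_zero_points):
    """2-D MatMulInteger with one zero point per column of B (a_zero_point = 0)."""
    inner, cols = len(b), len(b[0])
    if len(b_zero_points) != cols:
        raise ValueError("expected one zero point per column of B")
    return [
        [
            # Column n of B is shifted by its own zero point before multiplying.
            sum(row[k] * (b[k][n] - b_zero_points[n]) for k in range(inner))
            for n in range(cols)
        ]
        for row in a
    ]


a = [[-3, 7], [5, -2]]       # int8 values
b = [[10, 200], [0, 255]]    # uint8 values
y_per_column = matmul_integer_per_column(a, b, b_zero_points=[10, 128])
print(y_per_column)  # [[-70, 673], [20, 106]]
```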