feat: Add BFloat16 support for Gemm and MatMul CPU operators

This commit introduces BFloat16 support for the Gemm and MatMul operators on the CPU execution provider.

Key changes:
- Added the BFloat16 data type and moved its related files to onnxruntime/core/common.
- Implemented MlasBf16AccelerationSupported to detect hardware support for BFloat16.
- Added Gemm and MatMul kernels for BFloat16 using Eigen.
- Registered the new kernels for the CPU execution provider.
- Added unit tests for BFloat16 Gemm and MatMul.
- Fixed ambiguous comparison operators for BFloat16.
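
For reviewers unfamiliar with the format, here is a minimal, self-contained sketch of what a BFloat16 Gemm computes. It is illustrative only and is not the code in this commit: `Bf16` and `GemmBf16` are hypothetical names, the loop is a naive reference rather than the Eigen-based kernel, and accumulating in float is an assumption about how precision is typically preserved. The `explicit` constructor shows one common way to avoid ambiguous comparison overloads; whether that matches the fix made here is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a BFloat16 type (not the onnxruntime definition).
// BFloat16 keeps the upper 16 bits of an IEEE-754 binary32 value.
struct Bf16 {
  uint16_t bits = 0;

  Bf16() = default;

  // explicit: without it, an expression such as `bf16_value == 1.0f` could
  // match several conversion paths and become ambiguous (assumed scenario).
  explicit Bf16(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
      bits = 0x7FC0;  // NaN input: store a quiet NaN
    } else {
      u += 0x7FFFu + ((u >> 16) & 1u);  // round to nearest, ties to even
      bits = static_cast<uint16_t>(u >> 16);
    }
  }

  float ToFloat() const {
    uint32_t u = static_cast<uint32_t>(bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
  }
};

// Comparisons are defined only for Bf16 operands, via float.
inline bool operator==(Bf16 a, Bf16 b) { return a.ToFloat() == b.ToFloat(); }
inline bool operator<(Bf16 a, Bf16 b) { return a.ToFloat() < b.ToFloat(); }

// Naive reference Gemm: Y = alpha * A * B + beta * C.
// A is M x K, B is K x N, C is M x N, all row-major, no transposes.
// Products are accumulated in float and rounded to Bf16 once per output.
std::vector<Bf16> GemmBf16(std::size_t M, std::size_t N, std::size_t K,
                           float alpha, const std::vector<Bf16>& A,
                           const std::vector<Bf16>& B, float beta,
                           const std::vector<Bf16>& C) {
  assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
  std::vector<Bf16> Y(M * N);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k) {
        acc += A[i * K + k].ToFloat() * B[k * N + j].ToFloat();
      }
      Y[i * N + j] = Bf16(alpha * acc + beta * C[i * N + j].ToFloat());
    }
  }
  return Y;
}
```

MatMul is the special case alpha = 1, beta = 0. Rounding only once per output element keeps the error of long dot products close to that of a float computation, at the cost of converting each input on every use.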