Disable fp16 Gemm on CPU (#19744)
### Description
Temporarily disable fp16 Gemm on CPU: it usually needs a following Cast,
which offsets the performance gain. Re-enabling it needs more fp16 operator
implementations and performance tuning.
Also fix a fusion error in LayerNormalization.
### Motivation and Context