[WebGPU] Unify core implementations of GEMM and MatMul (#24586)
### Description
This PR extracts the core implementations into gemm_utils.cc, which is used
to generate shaders for both the GEMM and MatMul ops. The core implementations
include scalar and vec4 versions of GEMM and MatMul.
### Motivation and Context
GEMM and MatMul share a lot of common code, so we want to extract that
common code to unify their implementations.

---------
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>