[webgpu] Add Matmul8bits Support (#24546)
### Description
This PR adds support for 8-bit quantization to the `MatMulNBits`
operator in WebGPU.
It makes the following changes:
1. Unify on `MatMulNBitsProgram` as the fallback path. This program was
originally the generation path for block size = 32; it now supports any
block size, and the original, more complicated programs are removed.
2. Enable `MatMulNBitsWideTileProgram` for all platforms.
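
For readers unfamiliar with block-wise quantization, here is a minimal CPU reference sketch in Python of what an 8-bit quantized matmul with an arbitrary block size computes. It is not the WebGPU shader code from this PR. The function names, the unpacked one-`uint8`-per-weight layout, and the default zero point of 128 are assumptions made for illustration.

```python
def dequantize_blockwise(q, scales, block_size, zero_point=128):
    """Dequantize 8-bit weights that share one scale per block along K.

    q:      N rows of K unsigned 8-bit values (hypothetical unpacked layout).
    scales: N rows of ceil(K / block_size) floats, one per block.
    zero_point: assumed default for 8-bit weights in this sketch.
    """
    out = []
    for row, row_scales in zip(q, scales):
        # Element k belongs to block k // block_size, for any block size.
        out.append([(v - zero_point) * row_scales[k // block_size]
                    for k, v in enumerate(row)])
    return out


def matmul_8bit_reference(a, q, scales, block_size, zero_point=128):
    """Compute Y[m][n] = sum_k A[m][k] * dequant(B)[n][k]."""
    b = dequantize_blockwise(q, scales, block_size, zero_point)
    return [[sum(x * w for x, w in zip(a_row, b_row)) for b_row in b]
            for a_row in a]


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0, 4.0]]          # M = 1, K = 4
    q = [[129, 130, 126, 128]]          # N = 1, K = 4
    scales = [[0.5, 2.0]]               # two blocks of size 2
    print(matmul_8bit_reference(a, q, scales, block_size=2))
```

Because the block index is derived as `k // block_size`, nothing in the sketch depends on the block size being 32, which is the property item 1 describes for the unified fallback path.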