[webgpu] Fix MatMulNBits prefill shader synchronization (#23663)
### Description
This commit adds a `workgroupBarrier()` call to the MatMulNBits prefill
shader so that invocations within the same workgroup are properly
synchronized, resolving a potential race condition.
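
For readers unfamiliar with the pattern, here is a minimal WGSL sketch of the kind of race a `workgroupBarrier()` prevents. It is illustrative only and is not the actual MatMulNBits prefill shader: the names `TILE`, `tile`, `input`, and `output` are hypothetical, and the buffers are assumed to hold a multiple of `TILE` elements.

```wgsl
// Hypothetical sketch, not the real prefill shader.
const TILE : u32 = 64u;

@group(0) @binding(0) var<storage, read> input : array<f32>;
@group(0) @binding(1) var<storage, read_write> output : array<f32>;

// Memory shared by all invocations in one workgroup.
var<workgroup> tile : array<f32, TILE>;

@compute @workgroup_size(TILE)
fn main(@builtin(local_invocation_index) lid : u32,
        @builtin(workgroup_id) wid : vec3<u32>) {
  let base = wid.x * TILE;

  // Each invocation writes one element of the shared tile.
  tile[lid] = input[base + lid];

  // Without this barrier, an invocation could read an element below
  // before the invocation responsible for it has finished writing it.
  workgroupBarrier();

  // Each invocation reads an element written by a different invocation.
  output[base + lid] = tile[TILE - 1u - lid];
}
```

The barrier must be reached in uniform control flow, meaning every invocation in the workgroup executes it, so it sits outside any invocation-dependent branch.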
### Motivation and Context
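A `workgroupBarrier()` orders workgroup-memory accesses across the
invocations of a workgroup. Without it, the prefill shader was exposed to
the race condition described above.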