PR #20173 ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling