[webgpu] support smooth softmax for non-FA GQA implementation (#25285)
### Description
Add smooth softmax support to the non-FlashAttention (non-FA) GQA implementation in the WebGPU EP.
This change depends on:
- #25269
Work items:
- [x] support smooth softmax
- [x] support bias
- [x] support head sink (per-head smooth softmax)
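For reviewers unfamiliar with the terms, here is a minimal reference sketch of what the three work items compute for a single row of attention scores. It assumes smooth softmax means adding one extra logit to the softmax denominator (0 by default, or a per-head sink value), and that bias is added to the scores before normalization. The function name and parameters are hypothetical and chosen for illustration; this is not the WGSL shader code in this PR.

```python
import math

def smooth_softmax(scores, sink=0.0, bias=None):
    """Reference smooth softmax over one row of attention scores.

    Assumptions (illustrative, not taken from the shader source):
    - `sink` is the extra logit in the denominator; 0.0 gives plain
      smooth softmax, a per-head value gives the head-sink variant.
    - `bias`, if given, is added elementwise to the scores first.
    """
    if bias is not None:
        scores = [s + b for s, b in zip(scores, bias)]
    # Subtract the running max (including the sink) for numerical stability.
    m = max(max(scores), sink)
    exps = [math.exp(s - m) for s in scores]
    # The sink contributes to the denominator but yields no output element.
    denom = math.exp(sink - m) + sum(exps)
    return [e / denom for e in exps]

if __name__ == "__main__":
    row = [1.0, 2.0, 3.0]
    print(smooth_softmax(row))             # probabilities sum to less than 1
    print(smooth_softmax(row, sink=-2.5))  # hypothetical per-head sink value
```

Because the sink takes a share of the probability mass, the outputs sum to less than 1; passing `sink=float("-inf")` in this sketch reduces it to ordinary softmax.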
The following will not be included in this PR:
- support for FlashAttention
- support for sliding window