[Native WebGPU] Support shared memory version of ReduceOps (#24399)
### Description
Add a shared-memory version of ReduceOps to the native WebGPU execution provider.
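
For readers unfamiliar with the pattern, a shared-memory reduction typically has each thread in a workgroup accumulate a strided slice of the input into a workgroup-local buffer, then combines the buffer in a tree. The sketch below is only a CPU simulation of that general technique for a sum reduction. It is not the shader added in this PR; the function name `workgroup_reduce_sum` and the default `workgroup_size` are hypothetical.

```python
def workgroup_reduce_sum(values, workgroup_size=8):
    """Simulate one workgroup reducing `values` to a single sum.

    Illustrative only: the name and parameters are assumptions,
    not identifiers from the ONNX Runtime code base.
    """
    if workgroup_size <= 0 or workgroup_size & (workgroup_size - 1):
        raise ValueError("workgroup_size must be a power of two")

    # Phase 1: each simulated thread accumulates a strided slice of the
    # input into its own slot of the shared (workgroup) buffer.
    shared = [0.0] * workgroup_size
    for local_id in range(workgroup_size):
        for i in range(local_id, len(values), workgroup_size):
            shared[local_id] += values[i]

    # Phase 2: tree reduction. Each pass halves the number of active
    # slots; on a GPU a workgroup barrier would separate the passes.
    stride = workgroup_size // 2
    while stride > 0:
        for local_id in range(stride):
            shared[local_id] += shared[local_id + stride]
        stride //= 2

    return shared[0]
```

With this scheme, global memory is read once per element and the remaining combining steps run against the faster workgroup-local buffer.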
### Motivation and Context