[WebNN] Support more features for GQA (#27234)
Add support for GroupQueryAttention with:
- do_rotary=true (cos_cache/sin_cache inputs)
- Packed QKV (optional key/value inputs)
- Optional past_key/past_value for prefill mode
Also remove the fp16->fp32 casting workaround and add an
ApplyRotaryEmbedding helper function.
Fix the decode stage: use qkv_sequence_length to distinguish prefill
from decode, and compute rotary positions from the runtime seqlens_k
input instead of the static past_sequence_length.
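
For reviewers, here is a minimal Python sketch of the position logic
described above. It is not the WebNN graph built by this change. It
assumes seqlens_k[b] equals total_sequence_length[b] - 1 (as documented
for the GroupQueryAttention contrib op), non-interleaved rotation, and
unpadded prompts. The names rotary_positions, build_caches and
apply_rotary are hypothetical and exist only for illustration.

```python
import math


def rotary_positions(seqlens_k, qkv_sequence_length):
    """Position ids of the new tokens for each batch entry.

    Assumption: seqlens_k[b] == total_sequence_length[b] - 1.
    """
    positions = []
    for last_index in seqlens_k:
        total = last_index + 1
        # Tokens already in the KV cache: 0 for an unpadded prefill,
        # seqlens_k[b] for decode (qkv_sequence_length == 1).
        past = max(total - qkv_sequence_length, 0)
        positions.append([past + s for s in range(qkv_sequence_length)])
    return positions


def build_caches(max_len, head_size, base=10000.0):
    """Build cos/sin caches of shape (max_len, head_size // 2)."""
    half = head_size // 2
    cos_cache, sin_cache = [], []
    for pos in range(max_len):
        angles = [pos / (base ** (2 * i / head_size)) for i in range(half)]
        cos_cache.append([math.cos(a) for a in angles])
        sin_cache.append([math.sin(a) for a in angles])
    return cos_cache, sin_cache


def apply_rotary(x, cos_row, sin_row):
    """Rotate one head vector, pairing element i with element i + half."""
    half = len(x) // 2
    out = [0.0] * len(x)
    for i in range(half):
        a, b = x[i], x[i + half]
        out[i] = a * cos_row[i] - b * sin_row[i]
        out[i + half] = a * sin_row[i] + b * cos_row[i]
    return out
```

In decode, each batch entry rotates its single new token with the cache
row selected by its own seqlens_k value, so entries with different
history lengths get different positions. A static
past_sequence_length cannot express that.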