Add implementation of WebGPU EP (#22591)
### Description
This PR adds the actual implementation of the WebGPU EP based on
https://github.com/microsoft/onnxruntime/pull/22318.
This change includes the following:
<details>
<summary><b>core framework of WebGPU EP</b></summary>
- WebGPU EP factory classes for:
  - handling WebGPU options
  - creating WebGPU EP instance
  - creating WebGPU context
- WebGPU Execution Provider classes
  - GPU Buffer allocator
  - data transfer
- Buffer management classes
  - Buffer Manager
  - BufferCacheManager
    - DisabledCacheManager
    - SimpleCacheManager
    - LazyReleaseCacheManager
    - BucketCacheManager
- Program classes
  - Program (base)
  - Program Cache Key
  - Program Manager
- Shader helper classes
  - Shader Helper
  - ShaderIndicesHelper
  - ShaderVariableHelper
- Utils
  - GPU Query based profiler
  - compute context
  - string utils
- Misc
  - Python binding webgpu support (basic)
</details>
<details>
<summary><b>Kernel implementation</b></summary>
- onnx.ai (default opset):
  - Elementwise (math): Abs, Neg, Floor, Ceil, Reciprocal, Sqrt, Exp, Erf,
    Log, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh,
    Tanh, Not, Cast
  - Elementwise (activation): Sigmoid, HardSigmoid, Clip, Elu, Relu,
    LeakyRelu, ThresholdedRelu, Gelu
  - Binary (math): Add, Sub, Mul, Div, Pow, Equal, Greater,
    GreaterOrEqual, Less, LessOrEqual
  - (Tensors): Shape, Reshape, Squeeze, Unsqueeze
  - Where
  - Transpose
  - Concat
  - Expand
  - Gather
  - Tile
  - Range
  - LayerNormalization
- com.microsoft
  - FastGelu
  - MatMulNBits
  - MultiHeadAttention
  - RotaryEmbedding
  - SkipLayerNormalization
  - LayerNormalization
  - SimplifiedLayerNormalization
  - SkipSimplifiedLayerNormalization
</details>
<details>
<summary><b>Build, test and CI pipeline integration</b></summary>
- build works for Windows, macOS and iOS
- supports onnxruntime_test_all and the Python node tests
- added a new unit test for the `--use_external_dawn` build flag
- updated the macOS pipeline to build with WebGPU support
- added a new Windows pipeline for WebGPU
</details>
This change does not include:
- Node.js binding support for WebGPU (will be a separate PR)