[WebGPU] Reduce staging buffers for uploading initializers (#23968)
This change reduces the number of staging buffers used for uploading
initializers to the GPU in two ways. First, the upload staging buffers
are released early. Second, on UMA GPUs we use Dawn's
BufferMapExtendedUsages feature, which lets us write directly into the
destination GPU buffer without a staging buffer. For this to work, the
UMA GPU buffers must be mapped at creation. BufferManager is therefore
made aware of OnSessionInitializationEnd(), so that it can handle
buffer Create() and Upload() calls properly.
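
The control flow described above can be sketched roughly as follows. This
is a simplified, self-contained model and not the actual BufferManager
code: `SketchBuffer`, `SketchBufferManager`, and the staging counter are
hypothetical names, GPU memory is simulated with a `std::vector`, and no
Dawn API is called. It also assumes that buffers are mapped at creation
only while the session is still initializing, which the description above
implies but does not state outright.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical stand-in for a GPU buffer; not a Dawn or ONNX Runtime type.
struct SketchBuffer {
  std::vector<uint8_t> storage;  // simulated device memory
  bool mapped = false;           // true while the CPU may write to it directly
};

// Hypothetical model of a buffer manager that knows when initialization ends.
class SketchBufferManager {
 public:
  // direct_map_supported models "UMA GPU with BufferMapExtendedUsages".
  explicit SketchBufferManager(bool direct_map_supported)
      : direct_map_supported_(direct_map_supported) {}

  // Assumption: during session initialization on a supported device, the
  // buffer is created already mapped so initializer data can go in directly.
  std::shared_ptr<SketchBuffer> Create(size_t size) {
    auto buffer = std::make_shared<SketchBuffer>();
    buffer->storage.resize(size);
    buffer->mapped = direct_map_supported_ && !session_initialized_;
    return buffer;
  }

  void Upload(const void* src, SketchBuffer& dst, size_t size) {
    assert(size <= dst.storage.size());
    if (size == 0) return;
    if (dst.mapped) {
      // Direct path: write into the destination, then unmap. No staging.
      std::memcpy(dst.storage.data(), src, size);
      dst.mapped = false;
      return;
    }
    // Staging path: the staging buffer lives only for this call, which
    // models releasing it early rather than keeping it around.
    std::vector<uint8_t> staging(size);
    ++staging_buffers_created_;
    std::memcpy(staging.data(), src, size);
    std::memcpy(dst.storage.data(), staging.data(), size);
  }

  void OnSessionInitializationEnd() { session_initialized_ = true; }

  int staging_buffers_created() const { return staging_buffers_created_; }

 private:
  bool direct_map_supported_;
  bool session_initialized_ = false;
  int staging_buffers_created_ = 0;
};
```

In this model, an upload issued during initialization on a supported
device creates no staging buffer, while the same upload after
OnSessionInitializationEnd(), or on a device without the feature, goes
through a short-lived staging buffer.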
Credit to @fs-eire for the overall design of the implementation.