[WebGPU] Fix wait logic for inflight jobs (#20096)
* Enable tmate debugging for investigating thread safety issue
* Refactor wait and submit to operate on vector<wgpu::FutureWaitInfo>, and fix wait to delete only the future that is completed.
* Cleanup
* Remove clear change and run clang-format
* Cleanup