llama.cpp
4d828bd1 - ggml webgpu: Clean up per-thread parameter buffer pool and job submission logic (#19772)

Commit
6 days ago
ggml webgpu: Clean up per-thread parameter buffer pool and job submission logic (#19772)

* Allow webgpu_buf_pool to resize if needed, remove inflight_threads, and replace inflight_threads with num_kernels for submission
* Run clang-format
* Keep track of num batched kernels that have not been submitted yet
* Run clang-format
* Increase buf pool max size
* Increase param buf pool init size
* Remove webgpu buf pool resizing
* Merge with master
* Add buffer pool growth
* Move buffer pool growth outside of lock
* Reduce max pool size to 32
* Run clang-format
* Only resize param buf pool