llama.cpp
82677a6e - ggml-webgpu: compute pass batching and removing profiling overhead (#21873)

Committed 30 days ago
* Update register tiling matmul to use f32 accumulation
* Fix profiling code
* Fix register tiling matmul for Chrome, I'm blaming Dawn
* Update batch tuning value for iOS
* Fix compile error
* Fix use of new load function
* Move to a single query set for GPU profiling
* Move to batching compute passes when not profiling
* Refactor build_multi
* Remove iOS throttling now that we're batching compute passes