onnxruntime
35db7889 - [wasm] Use relaxed SIMD dot product in CopyPackA (#25165)

Commit
208 days ago
[wasm] Use relaxed SIMD dot product in CopyPackA (#25165)

### Description

This change replaces the previous zero-extend + 16-bit accumulation sequence with a single `wasm_i32x4_relaxed_dot_i8x16_i7x16_add` operation to compute row sums directly on 8-bit data.

### Motivation and Context

This update eliminates unpacking overhead and lifts the former constraints on k stride.
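
The row-sum idea can be illustrated with a scalar model of the relaxed dot product. This is a minimal sketch and not the code from this commit: `RelaxedDotModel` and `RowSum` are hypothetical names, the data is assumed to be signed 8-bit, and the second operand is assumed to stay in [0, 127], the range in which the relaxed instruction gives the same result on every engine. Multiplying by a vector of ones turns the dot product into a sum of each group of four adjacent bytes, accumulated directly into 32-bit lanes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using I8x16 = std::array<int8_t, 16>;
using I32x4 = std::array<int32_t, 4>;

// Scalar model of the relaxed dot product with accumulate: each 32-bit lane
// receives the sum of four adjacent 8-bit products plus the matching
// accumulator lane.
// Assumption: every element of b lies in [0, 127].
I32x4 RelaxedDotModel(const I8x16& a, const I8x16& b, const I32x4& acc) {
    I32x4 out = acc;
    for (std::size_t lane = 0; lane < 4; ++lane) {
        for (std::size_t j = 0; j < 4; ++j) {
            const std::size_t i = lane * 4 + j;
            assert(b[i] >= 0);  // int8_t is signed, so this enforces the 7-bit range
            out[lane] += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
        }
    }
    return out;
}

// Hypothetical helper: sums one row of signed 8-bit values, 16 at a time,
// by taking the dot product of each chunk with a vector of ones.
int32_t RowSum(const std::vector<int8_t>& row) {
    I8x16 ones;
    ones.fill(1);
    I32x4 acc{};  // four 32-bit partial sums, zero-initialized
    std::size_t k = 0;
    for (; k + 16 <= row.size(); k += 16) {
        I8x16 chunk;
        std::copy(row.begin() + k, row.begin() + k + 16, chunk.begin());
        acc = RelaxedDotModel(chunk, ones, acc);
    }
    // Horizontal reduction of the four lanes.
    int32_t sum = acc[0] + acc[1] + acc[2] + acc[3];
    // Scalar tail for a row length that is not a multiple of 16.
    for (; k < row.size(); ++k) {
        sum += row[k];
    }
    return sum;
}
```

In a real WebAssembly build the inner model would be replaced by the `wasm_i32x4_relaxed_dot_i8x16_i7x16_add` intrinsic named in the description. Because the products land in 32-bit lanes in one step, no separate widening of the 8-bit inputs is needed in this sketch.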