onnxruntime
1c846210 - Adding ARM64 depthwise convolution kernel for symmetric quantization (#9655)

Commit

4 years ago

Adding ARM64 depthwise convolution kernel for symmetric quantization (#9655)

Motivation and Context

Two improvements over the current kernel code:

1. Signed int8-based instructions, so there is no need to extend from 8 bits to 16 bits before multiplication.
2. Unrolled loop with manual software pipelining.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
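To illustrate the first point, here is a minimal scalar sketch of a depthwise convolution over signed int8 data. It is not the kernel from this commit: the function name `DepthwiseConvSymmetricRef`, the `[kernel_size][channels]` layout, and the simplification that both input and filter have a zero point of 0 are assumptions made for illustration. Under those assumptions each product is a plain int8 by int8 multiply accumulated in int32, with no zero-point subtraction on widened values first. A vectorized kernel would additionally unroll and pipeline the loop, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference, not the MLAS kernel from this commit.
// Assumed layout: input and filter are both [kernel_size][channels].
// Assumed quantization: zero point 0 for both operands (symmetric).
std::vector<int32_t> DepthwiseConvSymmetricRef(
    const std::vector<int8_t>& input,
    const std::vector<int8_t>& filter,
    size_t channels,
    size_t kernel_size) {
    assert(input.size() == channels * kernel_size);
    assert(filter.size() == channels * kernel_size);

    // One int32 accumulator per channel; depthwise means channel c of the
    // input is only ever combined with channel c of the filter.
    std::vector<int32_t> acc(channels, 0);
    for (size_t k = 0; k < kernel_size; ++k) {
        for (size_t c = 0; c < channels; ++c) {
            const int8_t x = input[k * channels + c];
            const int8_t w = filter[k * channels + c];
            // With zero point 0 the signed values multiply directly.
            acc[c] += static_cast<int32_t>(x) * static_cast<int32_t>(w);
        }
    }
    return acc;
}
```

The int32 accumulators would still need requantization (scale and clamp) to produce int8 output; that step is left out here.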

References

#9655 - Adding ARM64 depthwise convolution kernel for symmetric quantization

Author

chenfucn

Parents

9f4e8cf6
