Add optimized quantize function for ARM (#26867)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26867
Use caffe2::Int8Quantize for PyTorch mobile. Currently this is implemented only for uint8 tensors and runs using NEON intrinsics.
All other cases fall back to the naive PyTorch quantize_val implementation.
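For reference, the scalar fallback applies the standard affine quantization formula per element. A minimal sketch is below; the real quantize_val in ATen is templated over the quantized dtype and derives the clamp bounds from it, so the fixed 0..255 range here assumes uint8:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Scalar affine quantization of a single float to uint8:
//   q = clamp(zero_point + round(value / scale), 0, 255)
// Illustrative sketch, not the exact ATen signature.
uint8_t quantize_val_uint8(float scale, int32_t zero_point, float value) {
  int32_t q = zero_point + static_cast<int32_t>(std::nearbyint(value / scale));
  q = std::min<int32_t>(std::max<int32_t>(q, 0), 255);
  return static_cast<uint8_t>(q);
}
```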
Previously, this naive quantize_val implementation was slow on mobile, accounting for more than 50% of the total execution time.
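The fast path vectorizes the same math with NEON, processing 16 elements per iteration. A rough AArch64 sketch of the idea follows; the function name and loop structure are illustrative, not the actual caffe2::Int8Quantize code, and it multiplies by a precomputed reciprocal of scale rather than dividing:

```cpp
#include <arm_neon.h>
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch: affine-quantize n floats to uint8, 16 lanes per iteration.
void quantize_uint8_neon(const float* src, uint8_t* dst, int64_t n,
                         float inv_scale, int32_t zero_point) {
  const float32x4_t vinv_scale = vdupq_n_f32(inv_scale);
  const int32x4_t vzp = vdupq_n_s32(zero_point);
  int64_t i = 0;
  for (; i + 16 <= n; i += 16) {
    // Scale, round to nearest, and add the zero point, 4 lanes at a time.
    int32x4_t q0 = vaddq_s32(
        vcvtnq_s32_f32(vmulq_f32(vld1q_f32(src + i + 0), vinv_scale)), vzp);
    int32x4_t q1 = vaddq_s32(
        vcvtnq_s32_f32(vmulq_f32(vld1q_f32(src + i + 4), vinv_scale)), vzp);
    int32x4_t q2 = vaddq_s32(
        vcvtnq_s32_f32(vmulq_f32(vld1q_f32(src + i + 8), vinv_scale)), vzp);
    int32x4_t q3 = vaddq_s32(
        vcvtnq_s32_f32(vmulq_f32(vld1q_f32(src + i + 12), vinv_scale)), vzp);
    // Saturating narrow s32 -> s16 -> u8; this also clamps to [0, 255].
    int16x8_t p01 = vcombine_s16(vqmovn_s32(q0), vqmovn_s32(q1));
    int16x8_t p23 = vcombine_s16(vqmovn_s32(q2), vqmovn_s32(q3));
    vst1q_u8(dst + i, vcombine_u8(vqmovun_s16(p01), vqmovun_s16(p23)));
  }
  // Scalar tail: same math as the naive quantize_val.
  for (; i < n; ++i) {
    int32_t q = zero_point +
        static_cast<int32_t>(std::nearbyint(src[i] * inv_scale));
    dst[i] = static_cast<uint8_t>(std::min<int32_t>(std::max<int32_t>(q, 0), 255));
  }
}
```

Note that the saturating narrows (vqmovn/vqmovun) give the clamp to [0, 255] for free, which is part of why the vector path is so much cheaper than a per-element function call.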
Results (on mobile):
                           Before      After
aten::quantize_per_tensor  42.893 ms   0.340 ms
Total model runtime        70.5 ms     27.5 ms
Test Plan:
Verified the existing Python tests pass: python test/test_quantized.py TestQNNPackOps
Also tested a quantized MobileNetV2 on mobile and compared the outputs.
Imported from OSS
Differential Revision: D17638732
fbshipit-source-id: 76445d1e415e6e502d05ba5b900e5e1d875fc1b0