pytorch
27f4a78b - Add benchmark for per channel tensor quantization (#46017)

Commit
4 years ago
Add benchmark for per channel tensor quantization (#46017)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46017

Currently, only per tensor quantization is optimized for mobile using ARM intrinsics. This benchmark is added to help gauge the performance improvement on mobile after the same optimizations are applied to per channel quantization.

Test Plan:
Build for ARM Neon
```
BUILD_MOBILE_BENCHMARK=1 BUILD_MOBILE_TEST=1 ANDROID_DEBUG_SYMBOLS=1 BUILD_PYTORCH_MOBILE=1 ANDROID_ABI="armeabi-v7a with NEON" ./scripts/build_android.sh -DANDROID_CCACHE=$(which ccache) -DBUILD_BINARY=ON
```

Build for ARM64
```
BUILD_MOBILE_BENCHMARK=1 BUILD_MOBILE_TEST=1 ANDROID_DEBUG_SYMBOLS=1 BUILD_PYTORCH_MOBILE=1 ANDROID_ABI=arm64-v8a ./scripts/build_android.sh -DANDROID_CCACHE=$(which ccache) -DBUILD_BINARY=ON
```

Then run the benchmark binary over adb shell. Note that Android CPUs are not frequency locked by default, which can lead to noisy benchmark results. This can be changed by running the first two commands below for every CPU before pushing and running the binary.
```
adb shell "echo userspace > /sys/devices/system/cpu/${cpu}/cpufreq/scaling_governor"
adb shell "echo '2000000' > /sys/devices/system/cpu/${cpu}/cpufreq/scaling_setspeed"
adb push build_android/bin/quantize_per_channel /data/local/tmp/
adb shell "/data/local/tmp/quantize_per_channel"
```

Reviewed By: kimishpatel

Differential Revision: D24286488

fbshipit-source-id: 1e7942f0bb3d9d1fe172409d522be9f351a485bd
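
The test plan asks for the frequency-lock commands to be run "for every cpu" but leaves `${cpu}` unset. A minimal sketch of one way to expand it follows. The helper name `print_lock_cmds` and the core names `cpu0` to `cpu3` are assumptions for illustration and are not part of the commit; the real core list should be read from the device's `/sys/devices/system/cpu/` directory. The helper only prints the commands, so they can be reviewed before anything is sent to the device.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): print the adb commands
# that pin each listed CPU to a fixed frequency.
# Usage: print_lock_cmds <frequency> <cpu>...
print_lock_cmds() {
  freq="$1"
  shift
  for cpu in "$@"; do
    base="/sys/devices/system/cpu/${cpu}/cpufreq"
    # Switch the governor to userspace so the speed can be set manually.
    printf 'adb shell "echo userspace > %s/scaling_governor"\n' "$base"
    # Request the fixed frequency (same value as in the test plan).
    printf 'adb shell "echo %s > %s/scaling_setspeed"\n' "$freq" "$base"
  done
}

# Dry run for four assumed cores, using the frequency from the test plan.
print_lock_cmds 2000000 cpu0 cpu1 cpu2 cpu3
```

Once the printed commands look right for the target device, the output can be piped to `sh` to execute them. Printing first keeps the sketch safe to run on a machine with no device attached.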