Add benchmark for per channel tensor quantization (#46017)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46017
Currently, only per-tensor quantization is optimized on mobile using ARM intrinsics. This benchmark is
added to help gauge the performance improvement on mobile after applying the same optimizations to per-channel quantization.
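As a rough illustration, the benchmark exercises a call like `at::quantize_per_channel`. The sketch below is not the exact code in this PR; the Google Benchmark harness, tensor shapes, channel counts, and the `BM_QuantizePerChannel` name are illustrative assumptions.
```
// Sketch of a per-channel quantization benchmark (assumed shapes/sizes).
#include <benchmark/benchmark.h>
#include <ATen/ATen.h>

static void BM_QuantizePerChannel(benchmark::State& state) {
  const int64_t channels = state.range(0);
  // Float input quantized along the channel axis (axis = 1).
  auto input = at::rand({1, channels, 56, 56});
  auto scales = at::rand({channels}, at::kDouble) * 0.1 + 0.01;
  auto zero_points = at::zeros({channels}, at::kLong);
  for (auto _ : state) {
    auto q = at::quantize_per_channel(
        input, scales, zero_points, /*axis=*/1, at::kQUInt8);
    benchmark::DoNotOptimize(q);
  }
}
BENCHMARK(BM_QuantizePerChannel)->Arg(32)->Arg(64);
BENCHMARK_MAIN();
```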
Test Plan:
Build for ARM Neon
```
BUILD_MOBILE_BENCHMARK=1 BUILD_MOBILE_TEST=1 ANDROID_DEBUG_SYMBOLS=1 BUILD_PYTORCH_MOBILE=1 ANDROID_ABI="armeabi-v7a with NEON" ./scripts/build_android.sh -DANDROID_CCACHE=$(which ccache) -DBUILD_BINARY=ON
```
Build for ARM64
```
BUILD_MOBILE_BENCHMARK=1 BUILD_MOBILE_TEST=1 ANDROID_DEBUG_SYMBOLS=1 BUILD_PYTORCH_MOBILE=1 ANDROID_ABI=arm64-v8a ./scripts/build_android.sh -DANDROID_CCACHE=$(which ccache) -DBUILD_BINARY=ON
```
Then run the benchmark binary over adb shell. Note that the Android CPU is not frequency locked by default, which can lead to noisy benchmark results; this can be addressed by running the following for every CPU.
```
adb shell "echo userspace > /sys/devices/system/cpu/${cpu}/cpufreq/scaling_governor"
adb shell "echo '2000000' > /sys/devices/system/cpu/${cpu}/cpufreq/scaling_setspeed"
adb push build_android/bin/quantize_per_channel /data/local/tmp/
adb shell "/data/local/tmp/quantize_per_channel"
```
Reviewed By: kimishpatel
Differential Revision: D24286488
fbshipit-source-id: 1e7942f0bb3d9d1fe172409d522be9f351a485bd