pytorch
16c62a6d - [PyTorch Edge] Optimize Dequantize Tensor with Intrinsics (#65844)

Commit
4 years ago
[PyTorch Edge] Optimize Dequantize Tensor with Intrinsics (#65844)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65844

When run on [Partially Quantized Mobile Vision Transformer Model](https://www.internalfb.com/diff/D30648171), with config from rebasing onto v4 of D31869106

Before:
[AIBench Run (128ms)](https://www.internalfb.com/intern/aibench/details/309792316534505)
[Perf Report](https://interncache-all.fbcdn.net/manifold/aibench/tree/mobile/pt/profiling_reports/model_perf_1635881079420.html)

After:
[AIBench Run (117ms)](https://www.internalfb.com/intern/aibench/details/20433505461364)
[Perf Report](https://interncache-all.fbcdn.net/manifold/aibench/tree/mobile/pt/profiling_reports/model_perf_1635881527831.html)

Total events spent on at::native::dequantize_quantized reduced from 1.97 Billion to 0.97 Billion (~50% Reduction)

ghstack-source-id: 142166373

Test Plan:
To run quantized_test
- Clone open source repo
- Set ANDROID_NDK and ANDROID_SDK
- Build with ```BUILD_MOBILE_BENCHMARK=1 BUILD_MOBILE_TEST=1 ANDROID_DEBUG_SYMBOLS=1 BUILD_LITE_INTERPRETER=0 ANDROID_ABI=arm64-v8a ./scripts/build_android.sh -DANDROID_CCACHE=$(which ccache) -DBUILD_BINARY=ON```
- Move ```build_android/bin/quantized_test``` to devserver
- Use one world to connect to android device (ex. ```one_world android device pixel-3a```)
- In another terminal: Make quantized_test executable (```chmod +x quantized_test```), copy it to android device (```adb push quantized_test /data/local/tmp```), and run it (```adb shell /data/local/tmp/quantized_test```)

Results: {F676102702}

Also ```buck test mode/dev //caffe2/aten:quantized_test``` passes

To test performance on [Partially Quantized Mobile Vision Transformer Model](https://www.internalfb.com/diff/D30648171) with AI Bench:
- Save this config file: P466124028 (for example: D31869106)
- Before or after the changes in this diff, run ```buck run aibench:run_bench -- -b benchmark_mobile_vision_transformer_model_config.json --platform android/arm64 --framework pytorch --remote --devices Pixel-3a-11-30 --force_profile```

Reviewed By: kimishpatel

Differential Revision: D31066997

fbshipit-source-id: 9067e683e0181aa13a2b636b68ac4fe5a4b2e618
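
For readers unfamiliar with the operation being optimized, the following is a minimal sketch of affine dequantization, `out[i] = (in[i] - zero_point) * scale`, with an optional ARM NEON fast path. It is an illustration only and not the code from this commit: the function name `dequantize_u8`, its signature, and the 8-elements-per-iteration layout are assumptions made for the example, and the actual ATen kernel may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper for illustration; not the ATen implementation.
// Dequantizes n uint8 values: out[i] = (in[i] - zero_point) * scale.
inline void dequantize_u8(const std::uint8_t* in, float* out, std::size_t n,
                          float scale, std::int32_t zero_point) {
  // A uint8 zero point must lie in [0, 255]; this also keeps the
  // 16-bit subtraction in the NEON path free of overflow.
  assert(zero_point >= 0 && zero_point <= 255);
  std::size_t i = 0;
#if defined(__ARM_NEON)
  const int16x8_t zp = vdupq_n_s16(static_cast<std::int16_t>(zero_point));
  for (; i + 8 <= n; i += 8) {
    // Load 8 bytes, widen to 16 bits, and subtract the zero point.
    const uint16x8_t wide = vmovl_u8(vld1_u8(in + i));
    const int16x8_t centered = vsubq_s16(vreinterpretq_s16_u16(wide), zp);
    // Widen each half to 32 bits, convert to float, and apply the scale.
    const float32x4_t lo =
        vmulq_n_f32(vcvtq_f32_s32(vmovl_s16(vget_low_s16(centered))), scale);
    const float32x4_t hi =
        vmulq_n_f32(vcvtq_f32_s32(vmovl_s16(vget_high_s16(centered))), scale);
    vst1q_f32(out + i, lo);
    vst1q_f32(out + i + 4, hi);
  }
#endif
  // Scalar path: handles the tail on NEON builds and everything elsewhere.
  for (; i < n; ++i) {
    out[i] =
        static_cast<float>(static_cast<std::int32_t>(in[i]) - zero_point) * scale;
  }
}
```

The scalar loop doubles as the reference behavior, so the same function compiles and gives the same results on targets without NEON; only the per-element cost differs.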