Learnable Fake Quantizer Benchmark Test (#41429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41429
This diff adds a benchmark that measures the execution speed of the learnable fake quantization operator, in both the forward and backward passes, for both per-tensor and per-channel usage.
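Conceptually, fake quantization simulates quantization in floating point: scale the input, round, clamp to the quantized range, then dequantize; the "learnable" variant additionally lets gradients flow to the scale and zero point. A minimal pure-Python sketch of the per-tensor forward path (the function name and defaults here are illustrative, not the actual PyTorch API):

```python
def fake_quantize_per_tensor(x, scale, zero_point, quant_min=0, quant_max=255):
    """Quantize-clamp-dequantize each element; a sketch, not the real op."""
    out = []
    for v in x:
        # Map to the integer grid and round to the nearest grid point.
        q = round(v / scale + zero_point)
        # Clamp to the representable quantized range.
        q = max(quant_min, min(quant_max, q))
        # Dequantize back to floating point.
        out.append((q - zero_point) * scale)
    return out
```

In the per-channel variant, `scale` and `zero_point` are per-channel vectors rather than scalars, which is why its forward pass is measurably slower in the results below.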
Test Plan:
From the directory `torch/benchmarks/operator_benchmark` (the root directory is `caffe2` inside `fbcode` when working on a devvm):
- On a devvm, run the command `buck run pt:fake_quantize_learnable_test`
- On a personal laptop, run the command `python3 -m pt.fake_quantize_learnable_test`
Benchmark Results (Locally on CPU):
Each sample has dimensions **3x256x256**; each batch contains 16 samples (`N=16`).
- Per Tensor Forward: 0.023688 sec/sample
- Per Tensor Backward: 0.165926 sec/sample
- Per Channel Forward: 0.040432 sec/sample
- Per Channel Backward: 0.173528 sec/sample
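The per-sample figures above are obtained by executing the operator repeatedly and averaging the elapsed time. A generic sketch of such a timing harness (the function name and iteration counts are hypothetical; the actual benchmark is built on the `operator_benchmark` framework):

```python
import time

def benchmark(fn, warmup=2, iters=10):
    """Return the mean wall-clock seconds per call of fn()."""
    # Warm-up runs exclude one-time setup (allocation, JIT, caches)
    # from the measurement.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters
```

Dividing the mean per-batch time by the batch size (`N=16`) yields the sec/sample numbers reported above.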
Reviewed By: vkuzo
Differential Revision: D22535252
fbshipit-source-id: e8e953ff2de2107c6f2dde4c8d5627bdea67ef7f