add more Python interface functions to make quantization simpler (#18246)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18246
Simplifies the histogram collection and quantization process.
Histogram collection before this diff looked something like this:
```
from caffe2.quantization.server import dnnlowp_pybind11
...
dnnlowp_pybind11.ObserveHistogramOfOutput(hist_file)
for ...
    workspace.RunNet(predict_net)
# This triggers the Stop function in the observer to dump out the histogram
# file, but it can have the unintended consequence of also clearing all the
# other useful observers we attached.
dnnlowp_pybind11.ClearNetObservers()
```
After this diff we can do:
```
# Note: the net must be created first so there is a net to attach the observer to.
workspace.CreateNet(predict_net)
histogram_observer = dnnlowp_pybind11.AddHistogramObserver(predict_net, hist_file)
for ...
    workspace.RunNet(predict_net)
predict_net.RemoveObserver(histogram_observer)
```
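To illustrate why a per-observer handle is safer than clearing everything, here is a minimal sketch using toy stand-ins. ToyNet and ToyHistogramObserver are hypothetical names invented for this illustration and are not part of the Caffe2 or dnnlowp_pybind11 API; the sketch only assumes that stopping an observer is what triggers the histogram dump.
```python
class ToyHistogramObserver:
    """Hypothetical observer that records values until it is stopped."""

    def __init__(self):
        self.values = []
        self.stopped = False

    def observe(self, value):
        self.values.append(value)

    def stop(self):
        # In the real flow this is where the histogram file would be dumped.
        self.stopped = True


class ToyNet:
    """Hypothetical net that hands out a handle for each attached observer."""

    def __init__(self):
        self._observers = {}
        self._next_handle = 0

    def add_observer(self, observer):
        handle = self._next_handle
        self._next_handle += 1
        self._observers[handle] = observer
        return handle

    def remove_observer(self, handle):
        # Stops and detaches only the observer behind this handle.
        self._observers.pop(handle).stop()

    def run(self, value):
        for observer in self._observers.values():
            observer.observe(value)

    def num_observers(self):
        return len(self._observers)


net = ToyNet()
other = ToyHistogramObserver()  # stands in for some other useful observer
hist = ToyHistogramObserver()
net.add_observer(other)
hist_handle = net.add_observer(hist)
for value in [0.1, 0.5, 0.9]:
    net.run(value)
net.remove_observer(hist_handle)  # the other observer stays attached
```
In this toy model, removing by handle stops only the histogram observer, which mirrors the motivation for replacing the ClearNetObservers call.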
Choosing quantization parameters of weights before this diff looked something like this:
```
dnnlowp_pybind11.ObserveHistogramOfOutput(weight_hist_file)
workspace.RunNetOnce(init_net)
dnnlowp_pybind11.ClearNetObservers()  # Has the same issue as the histogram collection example above
dnnlowp_pybind11.RegisterQuantizationParamsWithHistogram(
    weight_hist_file, is_weight=True, qparams_output_file_name=qparams_file
)
workspace.CreateNet(init_net, overwrite=True)
dnnlowp_pybind11.ClearNetObservers()
logger.info("Loading quantization params from {}".format(qparams_file))
blobs_to_qparams = {}
with open(qparams_file) as f:
    lines = f.readlines()
for line in lines:
    op_id, op_type, output_id, tensor_name, mini, maxi, scale, zero_point, precision = (
        line.split()
    )
    op_id = int(op_id)
    output_id = int(output_id)
    op = net.Proto().op[op_id]
    if op_type != op.type or op.output[output_id] != tensor_name:
        print(
            "Corrupt qparams file {} {} {} {} {}".format(
                qparams_file, op_type, op.type, op.output[output_id], tensor_name
            )
        )
    blobs_to_qparams[tensor_name] = QuantizationParam(float(scale), int(zero_point))
```
After this diff, this can be simplified to:
```
blobs_to_qparams = {}
for op in init_net.Proto().op:
    for output in op.output:
        scale, zero_point = dnnlowp_pybind11.ChooseQuantizationParams(output)
        blobs_to_qparams[output] = QuantizationParam(scale, zero_point)
```
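For readers unfamiliar with what a (scale, zero_point) pair encodes, here is a rough sketch of choosing affine quantization parameters from a value range. It assumes a plain min/max scheme with unsigned 8-bit precision; QParams and choose_qparams_from_range are hypothetical names for this illustration, and this is not necessarily how ChooseQuantizationParams picks its values.
```python
from collections import namedtuple

# Hypothetical container for this sketch, not the QuantizationParam used above.
QParams = namedtuple("QParams", ["scale", "zero_point"])


def choose_qparams_from_range(min_val, max_val, precision=8):
    """Pick scale and zero_point so that real = scale * (q - zero_point)."""
    qmin, qmax = 0, (1 << precision) - 1
    # Extend the range to include 0 so that 0.0 is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    if max_val == min_val:
        # Degenerate range (all zeros): any positive scale works.
        return QParams(1.0, 0)
    scale = (max_val - min_val) / (qmax - qmin)
    # The zero point is the quantized value that maps back to real 0.0.
    zero_point = int(round(qmin - min_val / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return QParams(scale, zero_point)


params = choose_qparams_from_range(-1.0, 3.0)
```
With the range [-1.0, 3.0], the sketch yields a scale of 4/255 and a zero point of 64.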
Reviewed By: dskhudia
Differential Revision: D14544694
fbshipit-source-id: 4fd06cd63256201e2e9d15c39f503138d1be53c2