pytorch
9a24e573 - [Quant][fx] Add quant and scale ranges to BackendConfig

1 year ago

[Quant][fx] Add quant and scale ranges to BackendConfig

**Summary:** This commit adds the following constraints to BackendConfig:

* quant_min_lower_bound
* quant_max_upper_bound
* scale_min_lower_bound
* scale_max_upper_bound

This is motivated by QNNPACK constraints on qint8 weight values and the min scale value. Actually enforcing these constraints in the QNNPACK BackendConfig will follow in a future commit.

Today, users can also specify the above constraints through QConfigs, and these settings may not necessarily match the ones specified in the BackendConfig. In this case, we will handle the discrepancy as follows:

(1) Require QConfig quant ranges to fall within the backend's
(2) Require QConfig min scale value (eps) >= backend's
(3) Require QConfig to specify quant range if the backend specified one
(4) Require QConfig to specify min scale value (eps) if the backend specified one

If any of the above fails, then we will ignore this QConfig and log a warning. Note that this is consistent with existing handling, which ignores the QConfig if its dtypes don't match what the backend supports.

Public API changes:

* Previous API, still supported after this commit:

```
dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)
```

* New API:

```
dtype_config = DTypeConfig(
    input_dtype=DTypeWithConstraints(
        dtype=torch.quint8,
        quant_min_lower_bound=0,
        quant_max_upper_bound=127,
        scale_min_lower_bound=2 ** -12,
    ),
    output_dtype=DTypeWithConstraints(
        dtype=torch.quint8,
        quant_min_lower_bound=0,
        quant_max_upper_bound=127,
        scale_min_lower_bound=2 ** -12,
    ),
    weight_dtype=DTypeWithConstraints(
        dtype=torch.qint8,
        quant_min_lower_bound=-128,
        quant_max_upper_bound=127,
        scale_min_lower_bound=2 ** -12,
    ),
    bias_dtype=torch.float,
)
```

Note that scale_max is currently not used because there is no existing mechanism to enforce this on the observer. In the future, we can validate this as well if there is a use case.
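The four QConfig checks listed in the summary can be illustrated with a small, torch-free sketch. This is not the code added by the commit: `BackendConstraints`, `QConfigSettings`, and `satisfies_backend` are hypothetical names invented for illustration, and the sketch assumes that "fall within" means an inclusive comparison on both ends of the quant range. As in the summary, `scale_max_upper_bound` is left out because it is not enforced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendConstraints:
    """Hypothetical stand-in for the constraint fields on a backend dtype."""
    quant_min_lower_bound: Optional[int] = None
    quant_max_upper_bound: Optional[int] = None
    scale_min_lower_bound: Optional[float] = None


@dataclass
class QConfigSettings:
    """Hypothetical stand-in for the values a QConfig's observer specifies."""
    quant_min: Optional[int] = None
    quant_max: Optional[int] = None
    eps: Optional[float] = None


def satisfies_backend(qconfig: QConfigSettings, backend: BackendConstraints) -> bool:
    """Return True if the QConfig passes rules (1) through (4) from the summary."""
    backend_has_range = (
        backend.quant_min_lower_bound is not None
        and backend.quant_max_upper_bound is not None
    )
    if backend_has_range:
        # Rule (3): the QConfig must specify a quant range if the backend did.
        if qconfig.quant_min is None or qconfig.quant_max is None:
            return False
        # Rule (1): the QConfig range must fall within the backend's range
        # (assumed inclusive here).
        if (qconfig.quant_min < backend.quant_min_lower_bound
                or qconfig.quant_max > backend.quant_max_upper_bound):
            return False

    if backend.scale_min_lower_bound is not None:
        # Rule (4): the QConfig must specify eps if the backend set a min scale.
        if qconfig.eps is None:
            return False
        # Rule (2): the QConfig eps must be >= the backend's min scale.
        if qconfig.eps < backend.scale_min_lower_bound:
            return False

    return True


# Constraints mirroring the weight_dtype entry in the "New API" example.
weight_constraints = BackendConstraints(
    quant_min_lower_bound=-128,
    quant_max_upper_bound=127,
    scale_min_lower_bound=2 ** -12,
)

accepted = satisfies_backend(QConfigSettings(-127, 127, 2 ** -12), weight_constraints)
rejected = satisfies_backend(QConfigSettings(-127, 127, 2 ** -13), weight_constraints)
```

In this sketch a `False` result corresponds to the case where the summary says the QConfig is ignored and a warning is logged; the warning itself is omitted here.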
**Test Plan:**

* `python test/test_quantization.py TestBackendConfig.test_dtype_with_constraints`
* `python test/test_quantization.py TestQuantizeFx.test_backend_config_scale_min`
* `python test/test_quantization.py TestQuantizeFx.test_backend_config_quantization_range`

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo

ghstack-source-id: ed0958dc71c1efb6595a52b562743d225a061085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85200

References

backend_config_constraints

Author

andrewor14

Parents

e746fff8
