[Quant][fx][bc-breaking] Replace is_reference with convert_to_reference (#80091)
Summary: This PR removes the is_reference flag from the existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:
```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```
```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
prepare_fx,
convert_to_reference
)
prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```
Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168