f4b7c10e - fix resnet50_quantized_qat and mobilenet_v2_quantized_qat <> functionalization (#83339)

This won't actually fix the issue until we make FakeTensor always-on for AOTAutograd. I confirmed the fix with the following benchmark, with `normalize_ir=False` and `use_functionalize=True` set in the dynamo/functorch configs (run inside the `torchdynamo` repo):

```
terminal...$ python benchmarks/torchbench.py --training --devices=cuda --accuracy-aot-nop --generate-aot-autograd-stats --use-eval-mode --isolate --only=mobilenet_v2_quantized_qat
cuda train mobilenet_v2_quantized_qat      0.967x p=0.00

terminal...$ python benchmarks/torchbench.py --training --devices=cuda --accuracy-aot-nop --generate-aot-autograd-stats --use-eval-mode --isolate --only=resnet50_quantized_qat
cuda train resnet50_quantized_qat          0.943x p=0.00
```

I explained this in a bit more detail in the code comment: quantized models use a running-mean-style op, `fused_moving_avg_obs_fake_quant`, that takes the running min/max buffers stored on the module and mutates them, potentially resizing them. That causes AOTAutograd to complain: AOTAutograd first takes views of the inputs (using `.detach().requires_grad_(grad)`) and plumbs them through the function to figure out which outputs to trace the backward with. These new inputs have `TensorImpl::allow_tensor_metadata_change_ = false`, which makes the op fail when it tries to resize the running counter variables.

Once we're always using fake tensors, we shouldn't need the `.detach().requires_grad_()` calls anymore, since we already have fresh fake tensors to trace with.
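To make the failure mode concrete, here is a minimal standalone sketch (not from the commit; the buffer name and sizes are made up) of why resizing the duplicated inputs fails:

```python
import torch

# Stands in for a running min/max buffer stored on a quantized module.
running_min = torch.zeros(3)

# AOTAutograd duplicated each input roughly like this (grad=False for buffers).
# .detach() produces a tensor whose TensorImpl disallows metadata changes.
dup = running_min.detach().requires_grad_(False)

# fused_moving_avg_obs_fake_quant may resize the buffer; on the duplicate,
# allow_tensor_metadata_change_ is false, so the resize throws.
try:
    dup.resize_(5)
except RuntimeError as e:
    print(e)  # "... not allowed on a Tensor created from .data or .detach() ..."
```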
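And a hedged sketch of the always-fake-tensor direction: a fresh fake tensor is not a `.detach()`'d view of the user's tensor, so metadata changes on it are allowed. This is illustrative only, using the `torch._subclasses.FakeTensorMode` API, and assumes a build where fake tensors support `resize_`:

```python
import torch
from torch._subclasses import FakeTensorMode

with FakeTensorMode() as mode:
    # from_tensor yields a fresh fake tensor carrying only metadata,
    # not a .detach()'d view, so metadata changes are permitted.
    fake_min = mode.from_tensor(torch.zeros(3))
    fake_min.resize_(5)    # no error: the fake tensor can be freely resized
    print(fake_min.shape)  # torch.Size([5])
```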
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83339
Approved by: https://github.com/ezyang