[quant][graph] Remove redundant aten::wait calls in the graph (#45257)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45257
Currently we inline fork-wait calls when we insert observers for quantization.
In the case where fork and wait are in different subgraphs, inlining the fork-wait calls
only gets rid of the fork. This leaves the aten::wait call in the graph with a torch.Tensor
as input, which is currently not supported.
To avoid this, we check in the cleanup phase that the input to every aten::wait call
in the graph is of type Future[Tensor], and remove the redundant aten::wait calls.
Test Plan:
python test/test_quantization.py TestQuantizeJitPasses.test_quantize_fork_wait
Imported from OSS
Reviewed By: qizzzh
Differential Revision: D23895412
fbshipit-source-id: 3c58c6be7d7e7904eb6684085832ac21f827a399