fx quant: refactor observer insertion

Commit

3 years ago

fx quant: refactor observer insertion

Summary:
tl;dr: rewrites the FX graph mode quantization observer insertion to be easier to understand and extend.

The key conceptual difference from before is:
* before: for each node, observers are always inserted at the output of the current node, even if they are only needed by the next node. This is hard to reason about.
* after: for each node, observers are inserted at the inputs (if needed, as determined by the dtype of the argument and the dtype of the current node) and at the output (if needed for the type of pattern and the qconfig). Inserting observers for the current node requires no knowledge of future nodes.

This allows us to significantly simplify various things:
* all new observers needed for a node are inserted together. This makes it easier to understand and debug things. We add an invariant that node X will never change any observers inserted by any preceding or subsequent node, so to debug an issue the user only needs to understand what is happening for node X, without having to understand what happens before or after it.
* all the state tracking of activation_post_process_map and activation_post_process_indices is removed; instead, observers are looked up by graph traversals.
* since there is no longer a need for overlapping graph passes which mutate each other's intermediate state, it is easier to understand what the rules are for inserting observers, and to create new rules in the future.

Test Plan:
```
# all OSS tests pass
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Differential Revision: D28241864

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 950d58972d26362808564cc0a2dfb30413a3734d
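To make the per-node rule concrete, here is a minimal toy sketch in plain Python. It is not the PyTorch implementation and does not use `torch.fx`. The `Node` class, the `insert_observers` function, the dtype strings, and the `observe_output` flag (a stand-in for the pattern and qconfig decision) are all hypothetical names invented for illustration. It only models the idea described in the message: each node inserts observers on its own inputs when the dtypes disagree and on its own output when required, and existing observers are found by traversing the graph rather than by consulting a side map.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """Hypothetical toy graph node; not torch.fx.Node."""
    name: str
    op: str  # "placeholder", "call", or "observer"
    args: List["Node"] = field(default_factory=list)
    in_dtype: str = "float"   # dtype this node expects from its arguments
    out_dtype: str = "float"  # dtype this node produces
    observe_output: bool = False  # stand-in for the pattern/qconfig decision


def insert_observers(graph: List[Node]) -> List[Node]:
    """Return a new node list with observers added, visiting each node once."""
    result: List[Node] = []
    for idx, node in enumerate(graph):
        # Input side: compare each argument's dtype with what this node expects.
        for i, arg in enumerate(node.args):
            if arg.out_dtype == node.in_dtype:
                continue
            # Find a reusable observer by walking the graph built so far,
            # instead of keeping a separate state map.
            obs = next(
                (n for n in result
                 if n.op == "observer" and n.args[0] is arg
                 and n.out_dtype == node.in_dtype),
                None,
            )
            if obs is None:
                obs = Node(f"obs_{len(result)}", "observer", [arg],
                           arg.out_dtype, node.in_dtype)
                result.append(obs)
            node.args[i] = obs
        result.append(node)
        # Output side: decided from this node alone, with no look-ahead.
        if node.observe_output:
            obs = Node(f"obs_{len(result)}", "observer", [node],
                       node.out_dtype, node.out_dtype)
            result.append(obs)
            # Route later users through the new observer (their own
            # observers do not exist yet, so none are modified).
            for later in graph[idx + 1:]:
                later.args = [obs if a is node else a for a in later.args]
    return result


x = Node("x", "placeholder")
linear1 = Node("linear1", "call", [x], "quint8", "quint8", observe_output=True)
linear2 = Node("linear2", "call", [linear1], "quint8", "quint8", observe_output=True)
names = [n.name for n in insert_observers([x, linear1, linear2])]
print(names)
```

In this toy, `linear2` needs no input observer because the output observer of `linear1` already produces the dtype it expects, which mirrors the "inputs if needed, output if needed" rule without any look-ahead.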

Author

vkuzo

Committer

facebook-github-bot

Parents

2436377a

pytorch 4f50fdc2 - fx quant: refactor observer insertion
