pytorch
f3c25cd3 - [Quant][PT2.0] fix issues for rearranging weight observer for decomposed linear (#94296)

Commit
1 year ago
**Summary**
Linear is decomposed to `t - addmm/mm` after `dynamo.export`, and the weight's observer is initially inserted between `t` and `addmm/mm`. `_rearrange_weight_observer_for_addmm()` is then called to move the observer between the weight and `t`.
```
before:
         weight - t - observer \
           input - observer - addmm/mm
after:
         weight - observer - t \
           input - observer - addmm/mm
```

We found two issues with `_rearrange_weight_observer_for_addmm()`:
- It does not call `m.recompile()` at the end, so it does not function correctly.
- It does not support `aten.mm.default`, which comes from a decomposed linear without bias.

This PR fixes the two issues and renames the function to `_rearrange_weight_observer_for_decomposed_linear`.

**Test plan**
python test/test_quantization.py -k test_rearrange_weight_observer_for_decomposed_linear

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94296
Approved by: https://github.com/jgong5, https://github.com/andrewor14
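
The rewiring described in the summary can be illustrated with a toy graph. This is a minimal sketch in plain Python, not the PyTorch implementation: the `Node` class, `rearrange_weight_observer`, `build_linear`, and `weight_path` are hypothetical names made up for illustration, and the sketch assumes the weight operand is the last argument of both `addmm` and `mm`. A real `torch.fx.GraphModule` would additionally need `recompile()` after its graph is mutated, which the toy graph has no equivalent of.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a graph node: an op name plus input nodes."""
    op: str
    args: List["Node"] = field(default_factory=list)


# Handle both decompositions: addmm (linear with bias) and mm (no bias).
MATMUL_OPS = {"addmm", "mm"}


def rearrange_weight_observer(nodes: List[Node]) -> None:
    """Move the weight observer from after `t` to before `t`."""
    for node in nodes:
        if node.op not in MATMUL_OPS:
            continue
        obs = node.args[-1]  # assumed position of the weight operand
        if obs.op != "observer" or obs.args[0].op != "t":
            continue  # pattern `t - observer - matmul` not present
        t = obs.args[0]
        weight = t.args[0]
        obs.args[0] = weight  # observer now observes the raw weight
        t.args[0] = obs       # transpose consumes the observed weight
        node.args[-1] = t     # matmul consumes the transposed result


def build_linear(with_bias: bool) -> Node:
    """Build the 'before' graph for a decomposed linear."""
    w_obs = Node("observer", [Node("t", [Node("weight")])])
    x_obs = Node("observer", [Node("input")])
    if with_bias:
        return Node("addmm", [Node("bias"), x_obs, w_obs])
    return Node("mm", [x_obs, w_obs])


def weight_path(matmul: Node) -> List[str]:
    """Op names from the weight up to (excluding) the matmul node."""
    path, cur = [], matmul.args[-1]
    while True:
        path.append(cur.op)
        if not cur.args:
            break
        cur = cur.args[0]
    return list(reversed(path))
```

Calling `rearrange_weight_observer([build_linear(with_bias=False)])` exercises the `mm` case that the original function did not cover; the input-side observer is left untouched in both cases.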