amend D14778810 (#18902)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18902
The fix in D14778810 had an issue: when we fall back to acc32 because the outlier density is too high, W_quantized_ has already been modified. In this diff we first only count the outliers (without modifying W_quantized_). Only when the density is low enough that no fallback is needed do we modify W_quantized_ and construct the outlier matrix.
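A minimal sketch of the count-first approach follows. It is not the code in this diff. The names (Outlier, IsOutlier, CountOutliers, TryExtractOutliers, max_density), the flat int8 weight vector, and the choice to zero an extracted outlier in place are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for one extracted outlier (illustration only).
struct Outlier {
  std::size_t index;
  std::int8_t value;
};

// A weight is treated as an outlier if it does not fit in `nbits` signed bits.
bool IsOutlier(std::int8_t w, int nbits) {
  assert(nbits >= 1 && nbits <= 8);
  const int lo = -(1 << (nbits - 1));
  const int hi = (1 << (nbits - 1)) - 1;
  return w < lo || w > hi;
}

// Pass 1: count outliers. W is taken by const reference, so it cannot change.
std::size_t CountOutliers(const std::vector<std::int8_t>& W, int nbits) {
  std::size_t count = 0;
  for (std::int8_t w : W) {
    if (IsOutlier(w, nbits)) {
      ++count;
    }
  }
  return count;
}

// Returns false (the caller falls back to acc32) and leaves W untouched when
// the outlier density exceeds max_density. Otherwise moves the outliers out
// of W (assumption: the slot is zeroed) and returns true.
bool TryExtractOutliers(
    std::vector<std::int8_t>& W,
    int nbits,
    double max_density,
    std::vector<Outlier>& outliers) {
  outliers.clear();
  if (W.empty()) {
    return true;
  }
  const std::size_t count = CountOutliers(W, nbits);
  const double density = static_cast<double>(count) / W.size();
  if (density > max_density) {
    return false; // fallback path: W has not been modified
  }
  // Pass 2: only now is it safe to modify W.
  outliers.reserve(count);
  for (std::size_t i = 0; i < W.size(); ++i) {
    if (IsOutlier(W[i], nbits)) {
      outliers.push_back({i, W[i]});
      W[i] = 0;
    }
  }
  return true;
}
```

The extra counting pass costs one more read of the weights, but it keeps the fallback path free of side effects.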
Reviewed By: jspark1105
Differential Revision: D14785256
fbshipit-source-id: 03933110a4ca7409686a06b18a9bb921f8657950