pytorch
13640bf9 - disabling quantizing gradient in 8bw (#101739)

disabling quantizing gradient in 8bw (#101739)

Summary: Quantizing the *gradient* is not applicable to a complex ASR model.

Gradient in INT8: f438266519
Gradient in FP32: f438109197

The two WERs clearly show the limitation of quantizing the gradient. For now, we are okay with simply enabling quantized backpropagation while computing the gradient in FP32. This already saves memory due to the reduced model size.

Test Plan: Signals

Differential Revision: D45965552

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101739
Approved by: https://github.com/izaitsevfb
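A minimal sketch of the idea described above, not the code from this PR: 8-bit weight fake-quantization with a straight-through estimator, where the forward pass uses quantized weights but the backward pass leaves the gradient in FP32 rather than quantizing it. The names `FakeQuantize8bw` and `QuantizedLinear` are hypothetical and chosen here only for illustration.

```python
# Illustrative sketch (not from this PR): quantize weights to INT8 in the
# forward pass, but keep the gradient in FP32 in the backward pass.
import torch


class FakeQuantize8bw(torch.autograd.Function):
    @staticmethod
    def forward(ctx, weight):
        # Symmetric per-tensor INT8 quantization of the weight.
        scale = weight.abs().max().clamp(min=1e-8) / 127.0
        q = torch.clamp(torch.round(weight / scale), -128, 127)
        return q * scale  # dequantized weight used for the forward compute

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: the gradient is passed through
        # in FP32 and is NOT quantized.
        return grad_output


class QuantizedLinear(torch.nn.Linear):
    def forward(self, x):
        w_q = FakeQuantize8bw.apply(self.weight)
        return torch.nn.functional.linear(x, w_q, self.bias)


if __name__ == "__main__":
    layer = QuantizedLinear(16, 4)
    out = layer(torch.randn(2, 16))
    out.sum().backward()
    print(layer.weight.grad.dtype)  # torch.float32 -- gradient stays in FP32
```

With this setup, backpropagation flows through the quantized weights, but the accumulated gradient tensor remains a plain FP32 tensor, matching the trade-off described in the summary.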