transformers
68a894a5 - Fix uninitialized parameter in conformer relative attention. (#18368)

Fix uninitialized parameter in conformer relative attention. (#18368)

`torch.Tensor` creates an uninitialized tensor (as via `torch.empty`); this leads to nondeterministic behavior, poor initialization, and NaNs if you have an unlucky init. The paper does not specify the initialization for the bias terms, so zero seems like a reasonable choice: no bias initially. In practice the memory returned by `torch.Tensor` is usually zero-filled anyway, so this fix stays close to the previous behavior in the common case:

```
>>> torch.Tensor(100, 100).sum()
tensor(0.)
>>> torch.Tensor(100, 100).sum()
tensor(nan)
>>> torch.Tensor(100, 100).sum()
tensor(0.)
```
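A minimal sketch of the fix pattern (the class and parameter names below are illustrative assumptions, not the exact code from the commit): replace `torch.Tensor(...)` with `torch.zeros(...)` when declaring the bias parameters.

```
import torch
import torch.nn as nn

class ConformerRelAttentionBias(nn.Module):
    # Hypothetical minimal module illustrating the fix; the real class in
    # transformers is larger and these names are assumptions.
    def __init__(self, num_heads, head_size):
        super().__init__()
        # Buggy pattern: torch.Tensor allocates uninitialized memory, so the
        # values are whatever happened to be there (often zeros, sometimes NaNs):
        # self.pos_bias_u = nn.Parameter(torch.Tensor(num_heads, head_size))
        # Fixed pattern: deterministic zero init, i.e. no bias initially.
        self.pos_bias_u = nn.Parameter(torch.zeros(num_heads, head_size))
        self.pos_bias_v = nn.Parameter(torch.zeros(num_heads, head_size))

m = ConformerRelAttentionBias(num_heads=8, head_size=64)
assert not torch.isnan(m.pos_bias_u).any()  # always holds with zeros init
```

Zero init also has a nice property here: with the biases at zero, the relative-attention terms contribute nothing at first, so the module starts out behaving like plain attention and the biases are learned from there.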
Author: Piotr Dabkowski (committed 3 years ago)