transformers
8672bcda - Adafactor: avoid updating group["lr"] attributes (#9751)

Commit
5 years ago
Adafactor: avoid updating group["lr"] attributes (#9751)

This affects Adafactor with relative_step=False and scale_parameter=True. Updating group["lr"] makes the result of ._get_lr() depend on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.
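The dependence described above can be illustrated with a small, self-contained sketch. This is plain Python, not the transformers implementation: `get_lr`, `lrs_with_writeback`, `lrs_with_local`, and the `eps2` key are hypothetical names chosen for illustration, and the scaling rule (base rate times the larger of a small epsilon and the parameter's RMS) is assumed to mirror what ._get_lr() does when relative_step=False and scale_parameter=True.

```python
import math


def rms(values):
    """Root mean square of a flat list of floats (stand-in for a tensor's RMS)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def get_lr(group, param_rms):
    """Hypothetical stand-in for ._get_lr() with relative_step=False.

    Assumption: the base rate group["lr"] is scaled by max(eps2, RMS(param))
    when scale_parameter is enabled.
    """
    param_scale = max(group["eps2"], param_rms) if group["scale_parameter"] else 1.0
    return param_scale * group["lr"]


def lrs_with_writeback(group, params):
    """Pre-fix behaviour (sketch): the result is stored back into group["lr"]."""
    group = dict(group)  # copy so the caller's dict is not modified
    rates = []
    for p in params:
        group["lr"] = get_lr(group, rms(p))  # overwrites the base rate
        rates.append(group["lr"])
    return rates


def lrs_with_local(group, params):
    """Post-fix behaviour (sketch): the result is kept in a local variable."""
    rates = []
    for p in params:
        lr = get_lr(group, rms(p))  # group["lr"] stays untouched
        rates.append(lr)
    return rates


group = {"lr": 0.1, "eps2": 1e-3, "scale_parameter": True}
params = [[2.0, 2.0], [0.5, 0.5]]  # RMS 2.0 and RMS 0.5

writeback_rates = lrs_with_writeback(group, params)
local_rates = lrs_with_local(group, params)
print(writeback_rates, local_rates)
```

In this sketch the first parameter gets the same rate either way (0.1 × 2.0 = 0.2), but with the write-back the second parameter is scaled from 0.2 rather than 0.1, so it ends up at 0.1 instead of 0.05. Keeping the computed rate in a local variable removes that coupling between parameters.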