transformers
6890d196 - Shifting labels for causal LM when using label smoother (#17987)

Commit

3 years ago

Shifting labels for causal LM when using label smoother (#17987)

* Shifting labels for causal LM when using label smoother

When training a CausalLM, the loss is computed within the model's forward() function and the labels are shifted internally. However, if label smoothing is applied, the loss is computed in the trainer's compute_loss function and the labels are not shifted. As a result, labels are not aligned with their corresponding inputs. This commit resolves that misalignment.

Resolves #17960

On branch shift_labels_for_causalLM
Changes to be committed:
    modified: src/transformers/trainer.py
    modified: src/transformers/trainer_pt_utils.py

* Update trainer.py

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
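The misalignment described in the commit message can be illustrated with a small sketch. This is not the code from trainer.py or trainer_pt_utils.py; it is a pure-Python approximation for a single sequence, and the names `smoothed_loss`, `shift_labels`, `epsilon`, and `ignore_index` are assumptions chosen for illustration. It shows that when the logits at position t predict the token at position t+1, the label-smoothed loss is only meaningful once the last logit row and the first label are dropped.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def smoothed_loss(logits, labels, epsilon=0.1, shift_labels=False, ignore_index=-100):
    """Label-smoothed cross entropy for one sequence (illustrative sketch).

    logits: list of per-position rows, each of vocabulary size.
    labels: list of token ids, same length as logits.
    shift_labels: if True, compare logits[t] against labels[t + 1].
    """
    if shift_labels:
        logits = logits[:-1]   # the last position has no next token to predict
        labels = labels[1:]    # the first token is never a prediction target

    nll_total = 0.0
    uniform_total = 0.0
    active = 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # skip padded or masked positions
        neg_log_probs = [-lp for lp in log_softmax(row)]
        nll_total += neg_log_probs[label]
        uniform_total += sum(neg_log_probs) / len(row)
        active += 1

    if active == 0:
        return 0.0
    nll = nll_total / active
    uniform = uniform_total / active
    return (1 - epsilon) * nll + epsilon * uniform


# Toy causal LM output: the row at position t is peaked at the NEXT token.
tokens = [0, 1, 2, 3]
vocab_size = 4
logits = []
for t in range(len(tokens)):
    next_token = tokens[t + 1] if t + 1 < len(tokens) else 0  # last row is arbitrary
    logits.append([10.0 if v == next_token else 0.0 for v in range(vocab_size)])

unshifted = smoothed_loss(logits, tokens, shift_labels=False)
shifted = smoothed_loss(logits, tokens, shift_labels=True)
print(f"unshifted={unshifted:.4f} shifted={shifted:.4f}")
```

With the shift applied, a model that predicts every next token correctly gets a low loss; without it, the same predictions are scored against the wrong targets and the loss is high.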

References

#17987 - Shifting labels for causal LM when using label smoother

Author

seungeunrho

Parents

6f0723a9
