transformers
31bbef04 - Fix how we compute the final non-padding token for ForSequenceClassification models (#35911)

Commit

1 year ago

Fix how we compute the final non-padding token for ForSequenceClassification models (#35911)

* Fix how we compute the final non-padding token for Gemma (and probably other models)
* .size() -> .shape[]
* Propagating changes to other models
* Propagating changes to other models
* Change it for all ForSequenceClassification models
* Fix batch dim
* More TF fixes
* Copy the TF fix around as well
* Correct layer name for TFCTRL
* Cleaner .to()
* Clean up the nested if-else
* Use argmax() instead of .max().values
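The commit message names the technique (an argmax-based lookup of the final non-padding token) but does not show the code. The following is a minimal, framework-free sketch of one plausible reading, not the code from the commit: each position index is multiplied by a non-padding mask, and the argmax of the result is taken per row. The function name `last_non_pad_indices`, the plain-list inputs, and the sample batch are all hypothetical and chosen only for illustration.

```python
from typing import List


def last_non_pad_indices(input_ids: List[List[int]], pad_token_id: int) -> List[int]:
    """Return, for each non-empty row, the index of the rightmost non-padding token.

    Hypothetical helper for illustration; it is not part of the transformers API.
    """
    result = []
    for row in input_ids:
        # Keep the position index where the token is real, 0 where it is padding.
        scores = [i if tok != pad_token_id else 0 for i, tok in enumerate(row)]
        # argmax over the masked positions selects the rightmost non-padding token.
        # A row that contains only padding falls back to index 0.
        result.append(max(range(len(scores)), key=scores.__getitem__))
    return result


batch = [
    [5, 6, 7, 0, 0],  # right-padded
    [0, 0, 5, 6, 7],  # left-padded
    [0, 0, 0, 0, 0],  # all padding
]
print(last_non_pad_indices(batch, pad_token_id=0))  # → [2, 4, 0]
```

Under this reading, the same expression covers both left-padded and right-padded rows, since the largest surviving position index is always the last real token regardless of where the padding sits.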

Author

Rocketknight1

Committer

MekkCyber

Parents

ac4acde4
