Fix how we compute the final non-padding token for ForSequenceClassification models #35911
Commits:
f74d4dda Fix how we compute the final non-padding token for Gemma (and probabl…
067d99a2 .size() -> .shape[]
3e381f41 Propagating changes to other models
7cc13967 Propagating changes to other models
d3b7c994 Change it for all ForSequenceClassification models
1c11edc4 Fix batch dim
13a670ee More TF fixes
c671810f Copy the TF fix around as well
8ccde63d Correct layer name for TFCTRL
8c69579f Cleaner .to()
172cfd76 Clean up the nested if-else
26d554e3 Use argmax() instead of .max().values
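Taken together, the commits replace the old pooling logic, which located the first pad token and subtracted one (and so mishandled left-padded batches), with an argmax over position indices masked by non-padding tokens. The sketch below is a minimal illustration of that trick, not the exact merged code; the function name `last_non_pad_token_index` and the toy tensors are made up for the example.

```python
import torch

def last_non_pad_token_index(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Index of the rightmost non-padding token in each batch row.

    Multiplying each position index by the non-pad mask zeroes out the
    padding positions, so argmax returns the largest surviving index.
    Unlike locating the first pad token and subtracting one, this works
    for both right- and left-padded inputs.
    """
    non_pad_mask = (input_ids != pad_token_id).to(torch.int32)
    token_indices = torch.arange(input_ids.shape[-1], dtype=torch.int32)
    return (token_indices * non_pad_mask).argmax(dim=-1)

pad = 0
right_padded = torch.tensor([[5, 6, 7, pad, pad],
                             [5, 6, 7, 8, 9]])
left_padded = torch.tensor([[pad, pad, 5, 6, 7],
                            [5, 6, 7, 8, 9]])
print(last_non_pad_token_index(right_padded, pad))  # tensor([2, 4])
print(last_non_pad_token_index(left_padded, pad))   # tensor([4, 4])
```

The classification head then pools per row with something like `logits[torch.arange(batch_size), last_non_pad_token]`, which is presumably where the batch-dim fix, the single `.to()`, and the `.size()` -> `.shape` change from the commits above apply. Since several commits port the same fix to the TensorFlow classes (including TFCTRL), here is a hedged TensorFlow equivalent of the same trick, again with an illustrative function name:

```python
import tensorflow as tf

def tf_last_non_pad_token_index(input_ids: tf.Tensor, pad_token_id: int) -> tf.Tensor:
    # Same idea as the PyTorch sketch: zero out pad positions, then take
    # the argmax of the surviving position indices along the sequence axis.
    non_pad_mask = tf.cast(tf.math.not_equal(input_ids, pad_token_id), tf.int32)
    token_indices = tf.range(tf.shape(input_ids)[-1], dtype=tf.int32)
    return tf.argmax(token_indices * non_pad_mask, axis=-1, output_type=tf.int32)
```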
Rocketknight1 deleted the fix_sequence_classification_padding_side branch 1 year ago