transformers
Fix how we compute the final non-padding token for ForSequenceClassification models
#35911
Merged

Commits
  • Fix how we compute the final non-padding token for Gemma (and probably other models)
    Rocketknight1 committed 1 year ago
  • .size() -> .shape[]
    Rocketknight1 committed 1 year ago
  • Propagating changes to other models
    Rocketknight1 committed 1 year ago
  • Propagating changes to other models
    Rocketknight1 committed 1 year ago
  • Change it for all ForSequenceClassification models
    Rocketknight1 committed 1 year ago
  • Fix batch dim
    Rocketknight1 committed 1 year ago
  • More TF fixes
    Rocketknight1 committed 1 year ago
  • Copy the TF fix around as well
    Rocketknight1 committed 1 year ago
  • Correct layer name for TFCTRL
    Rocketknight1 committed 1 year ago
  • Cleaner .to()
    Rocketknight1 committed 1 year ago
  • Clean up the nested if-else
    Rocketknight1 committed 1 year ago
  • Use argmax() instead of .max().values
    Rocketknight1 committed 1 year ago
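
A minimal, framework-free sketch of the idea the commit titles point at: find the rightmost non-padding token in each sequence by masking position indices and taking an argmax. This is an illustration only, not the code merged in this PR. The helper name `last_non_pad_index` and the plain-list inputs are assumptions made for the example; the models themselves operate on tensors.

```python
def last_non_pad_index(input_ids, pad_token_id):
    """Return, for each row, the index of the rightmost non-padding token.

    Hypothetical helper for illustration; assumes every row is non-empty.
    """
    result = []
    for row in input_ids:
        # Keep the position index where the token is real, zero it where it is padding.
        masked_positions = [i if tok != pad_token_id else 0 for i, tok in enumerate(row)]
        # Argmax over the masked positions; ties resolve to the first index,
        # so a row with no non-padding token past position 0 yields 0.
        result.append(max(range(len(row)), key=masked_positions.__getitem__))
    return result


batch = [
    [5, 6, 7, 0, 0],  # right-padded
    [0, 0, 5, 6, 7],  # left-padded
    [5, 0, 6, 0, 0],  # pad id also appears mid-sequence
]
indices = last_non_pad_index(batch, pad_token_id=0)
```

Because the result depends on the position of the last real token rather than the first padding token, the same helper works for left-padded and right-padded batches, and for sequences where the padding id also occurs before the end.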