transformers
47d77657 - fix gemma4 gradient accumulation loss and last token incorrect labels (#45354)

Commit
53 days ago
fix gemma4 gradient accumulation loss and last token incorrect labels (#45354)

* fix gemma4 gradient accumulation loss and last token incorrect labels
* modular + also gemma3n

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>