T5gemma2 (#41834) - SemanticDiff

Commit

87 days ago

T5gemma2 (#41834)

* Fix small bug in T5Gemma 1 in __init__
* Add t5gemma2 model & configurations.
* Add auto support
* Add test case.
* Add doctree.
* Update positional embeddings to match latest update.
* Style fix & add use of final_logit_softcapping for attributes check.
* Update tests and embedding design.
* Add t5gemma2 to image-text-to-text category.
* Add T5Gemma2 doc.
* remove unused imports.
* minor update following comments.
* minor style fixes.
* fix config.
* Update T5Gemma2 following Anton's comments:
  1. Override _prepare_cache_for_generation to take care of cross-attention cache.
  2. Move vision preprocessing from main model to encoder.
  3. Clean and fix bugs in modular model.
* Add T5Gemma2VisionConfig.
* Minor updates.
* fix style
* re-structure vision encoder and minor fixes.
* fix parameter tying.
* remove several unnecessary codes and fix small bugs.
* update and fix init.
* Update weight tying and other minor changes.
* Skip `tie_word_embeddings` in config attributes check in T5Gemma2.
* minor fix.
* fix the inherence of t5gemma2decoderlayer
* sync to head.
* update decorator usage
* disable FA and Flex due to merged module behavior
* style

---------

Co-authored-by: vasqu <antonprogamer@gmail.com>

References

#41834 - T5gemma2

Author

bzhangGo

Parents

11e0fc12

transformers 2375ddb4 - T5gemma2 (#41834)
