transformers
3f860dba - Fix mask slicing for models with HybridCache (#35681)

Commit

1 year ago

Fix mask slicing for models with HybridCache (#35681)

* correctly slice
* check mask
* Update modular_gemma2.py
* fix
* add tests
* fix typo
* finally fix mask slicing
* Finally correctly slice in all cases!!
* add test for all attention functions
* small fix in tests
* trick around dynamo tracing issue
* last update
* more robust
* kwargs propagation
* make it explicit for checkpointing
* apply modular

References

#35681 - Fix mask slicing for models with HybridCache

Author

Cyrilvallez

Parents

b764c20b
