transformers
3f860dba - Fix mask slicing for models with HybridCache (#35681)

Commit
1 year ago
Fix mask slicing for models with HybridCache (#35681)

* correctly slice
* check mask
* Update modular_gemma2.py
* fix
* add tests
* fix typo
* finally fix mask slicing
* Finally correctly slice in all cases!!
* add test for all attention functions
* small fix in tests
* trick around dynamo tracing issue
* last update
* more robust
* kwargs propagation
* make it explicit for checkpointing
* apply modular