transformers
342e3f9f - Fix sliding window attention used in Gemma2FlashAttention2 (#32522)

Committed 1 year ago
* fix sliding window attention (flash2) in gemma2 model
* [run-slow] gemma
* fix slicing attention_mask for flash_attn2
* fix slicing attention_mask when flash_attn is used
* add missing comment
* slice the last seq_len tokens in the key, value states
* revert code of slicing key, value states
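The mask-slicing step the commit describes can be sketched as follows. This is a hypothetical illustration, not the actual transformers code: the function name `slice_mask_for_sliding_window` and the shapes are assumptions. The idea is that with a sliding window and a key/value cache, the padding mask spans the whole sequence, but flash attention only attends over the truncated key/value positions, so the mask's last dimension must be sliced to match.

```python
import numpy as np


def slice_mask_for_sliding_window(attention_mask: np.ndarray, kv_len: int) -> np.ndarray:
    """Keep only the mask columns for the key/value positions actually attended.

    attention_mask: (batch, total_len) padding mask over the full cached sequence.
    kv_len: number of key/value positions remaining after sliding-window
            truncation (at most the window size).
    """
    # If the cache was truncated to the sliding window, the mask is wider
    # than the key/value states; keep only its last kv_len columns.
    if attention_mask.shape[-1] > kv_len:
        attention_mask = attention_mask[:, -kv_len:]
    return attention_mask


# Example: a 10-token mask sliced down to a 4-token window.
mask = np.ones((1, 10), dtype=np.int64)
print(slice_mask_for_sliding_window(mask, kv_len=4).shape)  # (1, 4)
```

When the mask is already no wider than `kv_len`, it passes through unchanged, which mirrors why only the mask (and not the key/value states themselves) ended up needing the slice in the final commit.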