transformers
Fix sliding window attention used in Gemma2FlashAttention2
#32522
Merged


brcps12: fix sliding window attention (flash2) in gemma2 model (13cb6c08)
Reviewers: amyeroberts, ArthurZucker
ArthurZucker commented on 2024-08-08
ArthurZucker commented on 2024-08-09
ArthurZucker added the run-slow label
brcps12: [run-slow] gemma (e81fc78e)
brcps12 requested a review from ArthurZucker
ArthurZucker commented on 2024-08-12
brcps12: fix slicing attention_mask for flash_attn2 (f1adb8a7)
brcps12: fix slicing attention_mask when flash_attn is used (f912212d)
brcps12: Merge branch 'main' into fixing-sliding-window-attn (42f5d0e1)
brcps12: add missing comment (d3ae866a)
brcps12: slice the last seq_len tokens in the key, value states (73acbc18)
brcps12: revert code of slicing key, value states (991534ee)
ArthurZucker approved these changes on 2024-08-12
ArthurZucker merged 342e3f9f into main
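The commits above revolve around one idea: when Gemma 2 uses sliding window attention with flash attention 2, the attention mask must be sliced to stay in sync with the key/value states that fall inside the window. A minimal sketch of that invariant, assuming a hypothetical helper (this is not the actual Gemma2FlashAttention2 code; it slices along the sequence dimension of plain Python sequences, and the same slice expressions apply to tensors along their sequence axis):

```python
def slice_for_sliding_window(attention_mask, keys, values, sliding_window):
    """Keep only the last `sliding_window` positions of the KV cache,
    and slice the attention mask to the same trailing positions so the
    mask columns line up with the retained key/value states."""
    seq_len = len(keys)
    if seq_len > sliding_window:
        keys = keys[-sliding_window:]
        values = values[-sliding_window:]
        # The bug class this PR targets: forgetting this slice leaves the
        # mask longer than the KV states, misaligning attention scores.
        attention_mask = attention_mask[-sliding_window:]
    return attention_mask, keys, values
```

For example, with 10 cached positions and a window of 6, all three outputs should have length 6; with 4 cached positions, everything passes through unchanged.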
Participants: HuggingFaceDocBuilderDev, kiddj, ArthurZucker
