Fix llama model sdpa attention forward function masking bug when output_attentions=True #30652
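In short: when `output_attentions=True`, the SDPA attention classes fall back to the eager implementation, but mask preparation still used the SDPA shortcut that can drop the causal mask entirely (relying on `is_causal=True` inside the kernel), so the eager fallback ran without any causal masking. A minimal, self-contained sketch of the failure mode (shapes and tolerances here are illustrative, not taken from the PR):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = torch.randn(1, 2, 5, 8)  # (batch, heads, seq_len, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)

# SDPA kernel path: no mask tensor is needed, is_causal=True handles causality.
sdpa_out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Eager fallback with the mask dropped (the bug): attention runs fully unmasked.
scores = q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5
buggy_out = torch.softmax(scores, dim=-1) @ v

# Eager fallback with an explicit causal mask (the fix): -inf above the diagonal.
causal = torch.full((5, 5), float("-inf")).triu(1)
fixed_out = torch.softmax(scores + causal, dim=-1) @ v

print(torch.allclose(sdpa_out, buggy_out, atol=1e-5))  # False: wrong attentions/outputs
print(torch.allclose(sdpa_out, fixed_out, atol=1e-5))  # True: paths agree again
```

The fix gates the mask-dropping shortcut on `output_attentions` being False (the "ignore unnecessary sdpa mask converter" commit below) and propagates the same change to the models that copy Llama's masking code.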
Commits:
514c1c32  Fix llama model forward function with attention=True, same-length enc…
6c0fa7bb  Fix style
0d91bea9  propagate fix to modeling_cohere, gemma, dbrx, and olmo (which copy t…
894c14b8  Fix style
25843087  ignore unnecessary sdpa mask converter when output_attentions=True
c7bdc95d  Merge branch 'huggingface:main' into fix-llama-mask-output-attn
8d793a38  Merge branch 'huggingface:main' into fix-llama-mask-output-attn
fc143acf  Merge branch 'huggingface:main' into fix-llama-mask-output-attn
3e0fada2  add tests checking sdpa and eager outputs match when output_attention…
9acc1190  Split if statements in two lines
08dbd4bb  Merge branch 'huggingface:main' into fix-llama-mask-output-attn
9b79aee3  Fix formatting
dd699233  Add fix to new jetmoe model
ad4aded6  Add missing output_attentions argument to jetmoe mask creation
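The PR also adds tests checking that the SDPA and eager paths agree when `output_attentions=True`. A hedged sketch of that kind of test (the checkpoint name and tolerances are assumptions, not the exact test added upstream):

```python
import torch
from transformers import AutoModelForCausalLM

# Assumed tiny test checkpoint; any small Llama-family model would do.
model_id = "hf-internal-testing/tiny-random-LlamaForCausalLM"
eager = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="eager").eval()
sdpa = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="sdpa").eval()

input_ids = torch.randint(0, eager.config.vocab_size, (1, 16))
with torch.no_grad():
    out_eager = eager(input_ids, output_attentions=True)
    out_sdpa = sdpa(input_ids, output_attentions=True)

# Before the fix, the sdpa path dropped the causal mask on its eager
# fallback, so these checks failed.
assert torch.allclose(out_eager.logits, out_sdpa.logits, atol=1e-5)
for a, b in zip(out_eager.attentions, out_sdpa.attentions):
    assert torch.allclose(a, b, atol=1e-5)
```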