transformers · Kernels flash attn #39474 · Merged
ArthurZucker merged 46 commits into main from kernels-flash-attn.

Commits (46)
use partial to wrap around `transformers` utils!
cd4c7cb5
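A minimal sketch of the wrapping pattern this commit message describes, using functools.partial to pre-bind arguments that transformers utilities would normally supply; the function and argument names here are hypothetical, not the PR's actual code.

```python
from functools import partial

# Hypothetical kernel entry point: the last two arguments are values that
# transformers utilities would normally derive from the model config.
def flash_attention_forward(query, key, value, softmax_scale, is_causal):
    ...

# Pre-bind the config-derived arguments once so callers only pass tensors.
wrapped_forward = partial(flash_attention_forward, softmax_scale=0.125, is_causal=True)
# wrapped_forward(q, k, v) now calls the kernel with the bound defaults.
```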
try to refactor?
005f4821
revert one wrong change
1b834a4d
just a nit
d93f366e
push
2b7d411d
revert whatever was wrong!
affba20d
some nits
1959eb28
fixes when there is no attention mask
888cd402
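A hedged illustration of the "no attention mask" case: when callers pass attention_mask=None, a common fix is to treat every position as a real token rather than dereferencing None. The helper name is hypothetical.

```python
import torch

def prepare_attention_mask(attention_mask, input_ids):
    # Hypothetical helper: with no mask provided, assume there is no padding
    # and every token participates in attention.
    if attention_mask is None:
        attention_mask = torch.ones_like(input_ids, dtype=torch.long)
    return attention_mask
```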
Merge branch 'main' of github.com:huggingface/transformers into kerne…
8f5e62b9
bring the licence back
5a7ae113
some fixes
c57673bb
nit
7d69d834
Merge branch 'kernels-flash-attn' of github.com:huggingface/transform…
7e94910d
style
112e2a64
remove prints
501aa7ea
correct dtype
04088bec
ArthurZucker added the Flash Attention label
fa flags for testing
b1e104b0
update
7087e7b8
Merge branch 'main' into kernels-flash-attn
cc58aca3
use paged attention if requested!
6a2996a0
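A rough sketch, not the PR's code, of dispatching to paged attention only when the caller requests it; the registry and function names are assumptions.

```python
def eager_attention(query, key, value):
    ...  # default backend

def paged_attention(query, key, value):
    ...  # paged-KV backend

# Hypothetical registry keyed by the requested implementation string.
ATTENTION_BACKENDS = {
    "eager": eager_attention,
    "paged_attention": paged_attention,
}

def select_attention(requested: str):
    # Use paged attention only if explicitly requested; otherwise fall back.
    return ATTENTION_BACKENDS.get(requested, eager_attention)
```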
Merge branch 'kernels-flash-attn' of github.com:huggingface/transform…
8ddc5252
updates
a5862941
a clone was needed, not sure why
57842f56
automatically create cu seq lens when input is flash, this at least m…
43b7f322
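For the cumulative-sequence-length commit, here is a generic sketch of how cu_seqlens can be derived from a padding mask for flash-attention varlen kernels; this is the standard pattern, not necessarily the exact code added in the PR.

```python
import torch
import torch.nn.functional as F

def build_cu_seqlens(attention_mask: torch.Tensor) -> torch.Tensor:
    # attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding.
    # Varlen flash-attention kernels expect cumulative lengths, e.g.
    # per-sequence lengths [3, 5] -> cu_seqlens [0, 3, 8].
    seqlens = attention_mask.sum(dim=-1, dtype=torch.int32)
    return F.pad(torch.cumsum(seqlens, dim=0, dtype=torch.int32), (1, 0))
```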
simplify and improve?
12bad1b3
flash attention is kinda broken on recent cuda version so allow the o…
c0b600a5
ArthurZucker force-pushed from d35eb475 to c0b600a5 205 days ago
Merge branch 'main' into kernels-flash-attn
5c648746
kadirnar commented on 2025-07-21
fix!
11e50001
protect kernels import
1c073509
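A small sketch of what "protect kernels import" typically means: guard the optional dependency so importing transformers does not fail when the kernels package is absent. The availability flag name is an assumption.

```python
import importlib.util

# Check for the optional `kernels` package instead of importing it eagerly.
_kernels_available = importlib.util.find_spec("kernels") is not None

if _kernels_available:
    import kernels  # noqa: F401
else:
    kernels = None  # downstream code must check the flag before use
```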
MekkCyber approved these changes on 2025-07-21
Cyrilvallez commented on 2025-07-22
update
cdaa1eb6
properly parse generation config being passed
767d5852
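A hedged sketch of parsing a generation config passed by the caller: accept either a ready GenerationConfig or loose keyword arguments. GenerationConfig is the real transformers class; the helper itself is hypothetical.

```python
from transformers import GenerationConfig

def resolve_generation_config(generation_config=None, **kwargs):
    # Prefer an explicitly passed GenerationConfig; otherwise build one from
    # loose keyword arguments such as max_new_tokens.
    if generation_config is None:
        generation_config = GenerationConfig(**kwargs)
    return generation_config
```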
Merge branch 'kernels-flash-attn' of github.com:huggingface/transform…
10f866e1
revert and update
c75c5398
add two tests
a2f3126e
Merge branch 'main' of github.com:huggingface/transformers into kerne…
63b01c31
some fixes
85829d7d
fix test FA2
56981a55
takes comment into account
b3f7a49e
fixup
21e07f77
revert changes
a8b7ec65
revert the clone, it is only needed because the metal kernel is not d…
f111d338
[docs] update attention implementation and cache docs (#39547)
cd98c1fe
fix mps on our side for now
f457a085
ArthurZucker commented on 2025-07-22
Update src/transformers/integrations/flash_paged.py
38d241b4
Merge branches 'main' and 'kernels-flash-attn' of github.com:huggingf…
cb58187b
no qa
c0f4f099
ArthurZucker enabled auto-merge (squash) 204 days ago
ArthurZucker disabled auto-merge 204 days ago (manually disabled by user)
ArthurZucker merged efceeaf2 into main 204 days ago
ArthurZucker deleted the kernels-flash-attn branch 204 days ago
Reviewers: MekkCyber, Cyrilvallez, kadirnar
Assignees: No one assigned
Labels: Flash Attention
Milestone: No milestone