vllm
[Kernel] Flash Attention 3 Support
#12093
Merged

LucasWilkinson
github-actions
mergify added ci/build
LucasWilkinson force pushed 1 year ago
LucasWilkinson force pushed 1 year ago
LucasWilkinson changed the title from "[WIP][Kernel] Flash Attention 3 Support" to "[Kernel] Flash Attention 3 Support" 1 year ago
LucasWilkinson marked this pull request as ready for review 1 year ago
LucasWilkinson requested a review from tlrmchlsmth 1 year ago
LucasWilkinson requested a review from WoosukKwon 1 year ago
LucasWilkinson requested a review from robertgshaw2-redhat 1 year ago
LucasWilkinson requested a review from njhill 1 year ago
LucasWilkinson requested a review from ywang96 1 year ago
LucasWilkinson requested a review from comaniac 1 year ago
LucasWilkinson requested a review from alexm-redhat 1 year ago
WoosukKwon
WoosukKwon approved these changes on 2025-01-22
mergify
mergify added needs-rebase
robertgshaw2-redhat
LucasWilkinson build fa3 from vllm
4ec98938
LucasWilkinson v0 update to seqused_k
b3203983
LucasWilkinson add env var for controlling version
a84cdda1
LucasWilkinson update branch and python only build
71e5becb
LucasWilkinson fix mypy error
a48a9ad4
LucasWilkinson minor refactors
8d578cf6
LucasWilkinson codespell fix
234f6c19
LucasWilkinson update git hash
6d8e439f
LucasWilkinson update fa3
7f82709a
LucasWilkinson missing specify fa versions
ef4283b6
LucasWilkinson update fa3
509e9d85
LucasWilkinson add assert
34100161
LucasWilkinson fix tests + softmax
c5b6b4a2
LucasWilkinson fix building sm80 on H200
10b3cd2c
LucasWilkinson force pushed to 10b3cd2c 1 year ago
mergify removed needs-rebase
mgoin
mgoin commented on 2025-01-22
LucasWilkinson disable fp8 for now
2f9015eb
LucasWilkinson fix building .cu files that are included by other .cu files
d577b108
LucasWilkinson fix glob
e8c2fe1c
LucasWilkinson cut down fa3 binary size
ded7cb6d
LucasWilkinson update vllm-flash-attn cmake
6ccb5b25
WoosukKwon added ready
LucasWilkinson disable fa3 on 8.6 and 8.9
0c16227a
LucasWilkinson disable fa build on AMD
eafde62d
WoosukKwon
WoosukKwon commented on 2025-01-23
WoosukKwon merged 978b45f3 into main 1 year ago
houseroad
LucasWilkinson
