onnxruntime
Implement FlashAttention for CPU
#20805
Merged

yufenglee merged 36 commits into microsoft:main from duanqn:qiduan/flash
duanqn commented on 2024-05-27
duanqn commented on 2024-05-27
github-advanced-security commented on 2024-05-28
duanqn force pushed from db5fcb26 to f7235b30 1 year ago
tianleiwu commented on 2024-06-13
tianleiwu commented on 2024-06-13
tianleiwu commented on 2024-06-13
tianleiwu commented on 2024-06-14
tianleiwu commented on 2024-06-14
duanqn force pushed from 37d23258 to c8c12fff 1 year ago
tianleiwu commented on 2024-06-18
duanqn commented on 2024-06-19
github-advanced-security commented on 2024-06-19
tianleiwu commented on 2024-06-19
duanqn force pushed from c2da456a to f8584305 1 year ago
duanqn force pushed from f8584305 to 599ac3f3 1 year ago
tianleiwu marked this pull request as ready for review 1 year ago
tianleiwu requested a review 1 year ago
tianleiwu commented on 2024-06-21
duanqn force pushed from 7b82ac51 to 60e22806 1 year ago
duanqn commented on 2024-06-21
github-advanced-security commented on 2024-06-21
tianleiwu commented on 2024-06-22
tianleiwu commented on 2024-06-22
tianleiwu changed the title from "[WIP] Implement FlashAttention for CPU" to "Implement FlashAttention for CPU" 1 year ago
tianleiwu commented on 2024-06-24
duanqn force pushed from 57eba5fd to 7b130037 1 year ago
toothache approved these changes on 2024-06-25
tianleiwu commented on 2024-06-26
tianleiwu dismissed these changes on 2024-06-28
yufenglee commented on 2024-07-03
yufenglee commented on 2024-07-03
yufenglee commented on 2024-07-03
yufenglee commented on 2024-07-03
yufenglee commented on 2024-07-03
duanqn dismissed their stale review via 5a96b44e 1 year ago
tianleiwu commented on 2024-07-09
tianleiwu commented on 2024-07-09
duanqn Register new contrib op FlashAttention
37a31756
duanqn Move getenv to constructor
4ebe4546
duanqn Get Env
0d65ce26
duanqn Renaming
42b2acb8
duanqn Check for T==float
88a2600c
duanqn Lintrunner
53e2e851
duanqn Remove mlas function
945f656a
duanqn Handle scale; Require present_k and present_v to be empty
ee323fb4
duanqn Check is_unidirectional_
63e76ad0
tianleiwu fix build
3d6368b3
duanqn Merge with mlas.h
1fba73a3
duanqn Add comment and MLASCALL
9479623e
duanqn Remove unnecessary change
1e63e825
duanqn Fix onnxruntime_mlas.cmake
1fd0813d
duanqn Pick onnxruntime/test/python/transformers/benchmark_mha.py from lates…
afb74661
duanqn Disable FlashAttention by default
327b4c2e
duanqn Fix value choice of row_size_q and row_size_kv; Add comments
ab0da5b1
duanqn Fix order
8b190947
duanqn causal=False
8b2270a2
duanqn Add MLASCALL on implementation
b449524e
duanqn Improve comment
06251b1a
duanqn Enable FlashAttention by default
27b18d43
duanqn lintrunner -a
3059b44a
duanqn Remove memset
412f219b
duanqn Fix l2_cache_size_
44ff8f0a
duanqn Fix PREfast
54213354
duanqn #include <algorithm>
03d8f363
duanqn Fix bug
d63e528b
duanqn lintrunner
7a3d4a6c
duanqn Renaming
bf014d0b
duanqn Renaming
baff456a
duanqn Use MlasSgemmOperation
72f3c677
duanqn Move threading inside MLAS kernel
e8a4373c
Remove MLASCALL
46a8ce91
duanqn Remove 1 TODO
e1cf2890
duanqn Renaming
852fd98d
duanqn force pushed from 5554531b to 852fd98d 1 year ago
tianleiwu approved these changes on 2024-07-11
tianleiwu requested a review from yufenglee 1 year ago
yufenglee approved these changes on 2024-07-11
yufenglee merged 80b56feb into main 1 year ago
