Add memory efficient attention from CUTLASS #14343
Add memory efficient attention from cutlass
6873fb27
tianleiwu marked this pull request as draft 3 years ago
fix build errors
7f839113
Not compute cu_seqlens when no mask
cfa85596
update patch to fix build error
4f0664ef
not use patch file
2d07854d
remove "si" that declared but never referenced
bde4ae48
tianleiwu marked this pull request as ready for review 3 years ago
Review feedback
bce7894e
split to multiple cu files to speed up build
4a003fdc
Merge branch 'main' into tlwu/cutlass_memory_efficient_attention
9cb53ca6
not enable two fused backend
7ebe107a
wangyems approved these changes on 2023-01-20
tianleiwu merged commit 414b012f into main 3 years ago
tianleiwu deleted the tlwu/cutlass_memory_efficient_attention branch 3 years ago
faxu added triage:approved
faxu removed release:1.14
Assignees
No one assigned