Flash attention recompute #20603
flash attn recompute
45879ff5
use json file to pass recompute plans
3c374da6
fix
f822e7b1
pengwa
changed the title from "Flash attn recompute" to "Flash attention recompute" 2 years ago
fixes
4ee17c43
minor
11a15a0a
fix
36763054
fix build
ac44c6ce
fix win build
6b8120a2
fix win
53129178
fixes
20baf152
fix tests
c5cc3196
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
51110537
refinement
624adcd0
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
757ed236
fixes
f6ace9b8
restore the rng state for CPU and CUDA
94d510f7
pengwa
dismissed their stale review
via 94d510f7
1 year ago
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
56426960
remove c++ test because it is hard to maintain it
9b94d4cc
minor
0e9d80ff
wschin
approved these changes
on 2024-05-21
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
31e8b97f
pengwa
merged
8a98874e
into main 1 year ago
pengwa
deleted the pengwa/flash_attn_recompute branch 1 year ago
Assignees
No one assigned