onnxruntime
[CUDA] enable causal in MultiHeadAttention
#21852
Merged

tianleiwu merged 2 commits into main from tlwu/mha_causal
tianleiwu causal MHA
8ae33e06
tianleiwu marked this pull request as draft 1 year ago
tianleiwu update mha test
d6f3378c
tianleiwu marked this pull request as ready for review 1 year ago
tianleiwu requested a review from wangyems 1 year ago
tianleiwu requested a review from kunal-vaishnavi 1 year ago
tianleiwu requested a review from yufenglee 1 year ago
wangyems approved these changes on 2024-08-26
kunal-vaishnavi approved these changes on 2024-08-26
tianleiwu merged ad382120 into main 1 year ago
tianleiwu deleted the tlwu/mha_causal branch 1 year ago

Assignees
No one assigned