onnxruntime
[CUDA] Fuse add bias and transpose into one kernel in Attention
#12670
Merged

tianleiwu merged 2 commits into main from tlwu/bert_bias_transpose
tianleiwu fuse add bias and transpose in attention
d2680c5e
tianleiwu requested a review from yufenglee 3 years ago
tianleiwu requested a review from wangyems 3 years ago
tianleiwu marked this pull request as draft 3 years ago
tianleiwu changed the title from "Fuse add bias and transpose into one kernel in Attention" to "[CUDA] Fuse add bias and transpose into one kernel in Attention" 3 years ago
tianleiwu format
933ee8dc
tianleiwu marked this pull request as ready for review 3 years ago
wangyems approved these changes on 2022-08-22
tianleiwu merged 8d78f96d into main 3 years ago
tianleiwu deleted the tlwu/bert_bias_transpose branch 3 years ago