onnxruntime
[CUDA] Fuse add bias and transpose into one kernel in Attention
#12670

Merged

tianleiwu merged 2 commits into main from tlwu/bert_bias_transpose

fuse add bias and transpose in attention

d2680c5e

tianleiwu requested a review from yufenglee 3 years ago

tianleiwu requested a review from wangyems 3 years ago

tianleiwu marked this pull request as draft 3 years ago

tianleiwu changed the title ~~Fuse add bias and transpose into one kernel in Attention~~ [CUDA] Fuse add bias and transpose into one kernel in Attention 3 years ago

format

933ee8dc

tianleiwu marked this pull request as ready for review 3 years ago

wangyems approved these changes on 2022-08-22

tianleiwu merged 8d78f96d into main 3 years ago

tianleiwu deleted the tlwu/bert_bias_transpose branch 3 years ago

Reviewers

wangyems

yufenglee

