onnxruntime
MultiheadAttention CUDA BF16 Support
#26083
Merged

nenad1002 merged 4 commits into main from nebanfic/mha-bf16
nenad1002
nenad1002 MHA BF16
6884d83d
nenad1002 Clean code
3930e267
nenad1002 marked this pull request as ready for review 100 days ago
nenad1002 requested a review from tianleiwu 99 days ago
tianleiwu commented on 2025-09-23
nenad1002 use OrtCudaType
0891b14e
nenad1002 ci: trigger pipeline
3ef53ab8
nenad1002 requested a review from tianleiwu 94 days ago
tianleiwu approved these changes on 2025-09-25
nenad1002 merged b7ae53f7 into main 93 days ago
nenad1002 deleted the nebanfic/mha-bf16 branch 93 days ago
