onnxruntime
[CUDA] BF16 MoE and qMoE
#25572
Merged

kunal-vaishnavi merged 5 commits into main from tlwu/qmoe_bfloat16
tianleiwu add bf16
f32cc1f2
tianleiwu marked this pull request as draft 233 days ago
tianleiwu update op doc
7e145bd8
tianleiwu update test
d88cff6b
tianleiwu remove unused test file
ce964f98
tianleiwu pipeline mode
b2317806
tianleiwu marked this pull request as ready for review 231 days ago
tianleiwu requested a review from apsonawane 230 days ago
tianleiwu requested a review from kunal-vaishnavi 230 days ago
kunal-vaishnavi commented on 2025-07-31
kunal-vaishnavi commented on 2025-07-31
kunal-vaishnavi commented on 2025-07-31
kunal-vaishnavi approved these changes on 2025-07-31
kunal-vaishnavi merged 68b9d9bf into main 230 days ago
kunal-vaishnavi deleted the tlwu/qmoe_bfloat16 branch 230 days ago
jywu-msft added release:1.23.0
tianleiwu added cherry-picked
tianleiwu removed release:1.23.0
