onnxruntime
[CUDA] BF16 MoE and qMoE
#25572
Merged

kunal-vaishnavi merged 5 commits into main from tlwu/qmoe_bfloat16
tianleiwu
tianleiwu add bf16
f32cc1f2
tianleiwu marked this pull request as draft 203 days ago
tianleiwu update op doc
7e145bd8
tianleiwu update test
d88cff6b
tianleiwu remove unused test file
ce964f98
tianleiwu pipeline mode
b2317806
tianleiwu marked this pull request as ready for review 201 days ago
tianleiwu requested a review from apsonawane 200 days ago
tianleiwu requested a review from kunal-vaishnavi 200 days ago
kunal-vaishnavi commented on 2025-07-31
kunal-vaishnavi commented on 2025-07-31
kunal-vaishnavi commented on 2025-07-31
kunal-vaishnavi approved these changes on 2025-07-31
kunal-vaishnavi merged 68b9d9bf into main 200 days ago
kunal-vaishnavi deleted the tlwu/qmoe_bfloat16 branch 200 days ago
jywu-msft added release:1.23.0
tianleiwu added cherry-picked
tianleiwu removed release:1.23.0
