onnxruntime
Update qMoE spec to support block quantization
#25641
Merged

tianleiwu merged 4 commits into main from tlwu/block_wise_qmoe
tianleiwu update qMoE to support block_size
432ed683
tianleiwu force pushed from 8b9e59be to 432ed683 240 days ago
tianleiwu fix ,
dec42f67
tianleiwu format
f209bb55
tianleiwu update doc
b532a767
tianleiwu requested a review from kunal-vaishnavi 240 days ago
kunal-vaishnavi approved these changes on 2025-08-04
tianleiwu merged 59871e3b into main 239 days ago
tianleiwu deleted the tlwu/block_wise_qmoe branch 239 days ago
kunal-vaishnavi added release:1.23.2
apsonawane removed release:1.23.2
apsonawane added cherry-picked
