onnxruntime
Update QMoE kernel with optimizations
#26091
Open

apsonawane wants to merge 4 commits into main from asonawane/update
apsonawane force pushed from 426e1ac7 to 509e17c4 138 days ago
apsonawane requested a review from tianleiwu 138 days ago
tianleiwu commented on 2025-09-23
tianleiwu commented on 2025-09-23
tianleiwu commented on 2025-09-23
apsonawane Fix merge conflicts
83ffc5d9
apsonawane Re-enable quantized Mlas
d399c66d
apsonawane force pushed from 509e17c4 to 41133b7d 135 days ago
apsonawane Add overflow safety changes
8289fcb1
apsonawane force pushed from 41133b7d to 8289fcb1 135 days ago
apsonawane Disable quantized Mlas, still not giving good tps
d57a7c3e
