onnxruntime
[CPU] Improve QMoE kernel
#25822
Merged

apsonawane merged 36 commits into main from asonawane/qmoe
apsonawane Fixes CPU kernel
adcdca72
apsonawane Additional fixes
541e08b8
apsonawane Optimizations
764b55a1
apsonawane Fix pipelines
27d05d5b
apsonawane Address comments
c1500b8d
apsonawane Address comments
85268ad7
apsonawane Revert "Address comments"
37c0858b
apsonawane Fix the memory optimization issue
85874ff1
apsonawane Fix race condition
1c9f927b
apsonawane Fix unused variables
f7746829
apsonawane Optimizations
728d7a88
apsonawane Fix
c2386f5b
apsonawane Debugging alot
a6da84db
apsonawane Remove comments
e2c5d689
apsonawane Some modifications
4c905ae8
apsonawane FC1 fixed
c3647589
apsonawane Working fix
ed52e130
apsonawane Remove print statements
1ea12bca
apsonawane Low diff values
f5be0cec
github-advanced-security commented on 2025-08-22
apsonawane Rebase with main
e450158b
apsonawane Fix
471bb8b1
apsonawane Fix tests
b015c3de
apsonawane Fix pipelines
2b674658
tianleiwu refactoring
f85a9f16
github-actions commented on 2025-08-22
tianleiwu format
1bcb20d0
tianleiwu parallel optimization
25aa31bf
apsonawane commented on 2025-08-23
apsonawane commented on 2025-08-23
tianleiwu fix build
ca180b66
tianleiwu eliminate the intermediate memcpy after SwiGLU
6a484862
tianleiwu parallelize the routing logic
c369322e
github-actions commented on 2025-08-23
github-advanced-security commented on 2025-08-23
tianleiwu format
73a437c4
tianleiwu refactoring output
94a27297
apsonawane Fix pipelines
5de1b217
apsonawane Update cpu tests to use same python reference implementation as cuda …
27c1c055
apsonawane force pushed from a39580c5 to 27c1c055 211 days ago
github-advanced-security commented on 2025-08-25
apsonawane Fix tests
81e6713a
apsonawane Remove failing CPU test
d11f51cf
tianleiwu commented on 2025-08-26
apsonawane Add legacy shape check back
a7978f88
apsonawane force pushed from c9cdf689 to a7978f88 210 days ago
tianleiwu approved these changes on 2025-08-26
apsonawane marked this pull request as ready for review 210 days ago
apsonawane merged db4b0f4e into main 210 days ago
apsonawane deleted the asonawane/qmoe branch 210 days ago
apsonawane added cherry-picked