openvino
203aaddf - [NPUW] Fix MoE model builder for NPU compatibility

Commit

33 days ago

[NPUW] Fix MoE model builder for NPU compatibility

- Router weights use i8 (not user-specified i4/nf4), matching real GPT-OSS two-pass quantization where router is excluded from 4-bit
- Use i32 for MoE shape/axis/step constants matching HuggingFace export
- 3D expert scales [batch, rows, 1] for DEVICE_ROUTED expert detection
- Router nodes named .mlp.expert.router.* for DEVICE_ROUTED transform

Related: EISW-209517
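The shape convention for the expert scales can be illustrated with a minimal sketch. This is not the NPUW model builder code: the helper names `quantize_rows_i8` and `shape` are hypothetical, the leading `batch` dimension is assumed to index experts, and symmetric per-row i8 quantization is assumed purely for illustration. It only shows how a 3D weight tensor yields scales shaped [batch, rows, 1].

```python
# Illustrative sketch only: per-row symmetric i8 quantization in pure Python.
# Assumptions (not taken from the commit): the first dimension indexes
# experts, and each row gets one scale, so scales are [batch, rows, 1].

def quantize_rows_i8(weights):
    """Quantize a nested list [batch][rows][cols] of floats to i8 values.

    Returns (q, scales), where q has the same shape as weights and
    scales has shape [batch][rows][1].
    """
    q, scales = [], []
    for expert in weights:
        q_expert, s_expert = [], []
        for row in expert:
            max_abs = max(abs(v) for v in row)
            # Use a neutral scale for all-zero rows to avoid dividing by zero.
            scale = max_abs / 127.0 if max_abs > 0 else 1.0
            q_expert.append(
                [max(-127, min(127, round(v / scale))) for v in row]
            )
            # Keep the trailing singleton dimension so the scale broadcasts
            # over the columns of its row.
            s_expert.append([scale])
        q.append(q_expert)
        scales.append(s_expert)
    return q, scales


def shape(nested):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


weights = [
    [[1.0, -2.0], [0.5, 0.25]],
    [[0.0, 0.0], [4.0, -4.0]],
]
q, scales = quantize_rows_i8(weights)
print(shape(q), shape(scales))
```

Keeping the scale as a rank-3 tensor with a trailing dimension of 1, rather than squeezing it to [batch, rows], means it broadcasts directly against the weights without a reshape.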

Author

dylanneve1

Committer

dylanneve1

Parents

e401972d
