openvino
203aaddf - [NPUW] Fix MoE model builder for NPU compatibility

Commit
33 days ago
[NPUW] Fix MoE model builder for NPU compatibility

- Router weights use i8 (not user-specified i4/nf4), matching real GPT-OSS two-pass quantization where router is excluded from 4-bit
- Use i32 for MoE shape/axis/step constants matching HuggingFace export
- 3D expert scales [batch, rows, 1] for DEVICE_ROUTED expert detection
- Router nodes named .mlp.expert.router.* for DEVICE_ROUTED transform

Related: EISW-209517
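
The four conventions listed in the commit message can be summarized in a small, library-free sketch. This is not the NPUW model builder itself: `MoELayerPlan`, `plan_moe_layer`, and the `layers.{i}` name prefix are hypothetical names introduced only for illustration. It also assumes that expert weights keep the user-specified precision, which the message implies but does not state outright. Only the i8 router precision, the i32 constant type, the `[batch, rows, 1]` scale shape, and the `.mlp.expert.router.` naming come from the commit message.

```python
from dataclasses import dataclass

# Values taken from the commit message.
ROUTER_PRECISION = "i8"         # router is excluded from 4-bit quantization
SHAPE_CONST_PRECISION = "i32"   # shape/axis/step constants
ROUTER_NAME_INFIX = ".mlp.expert.router."


@dataclass(frozen=True)
class MoELayerPlan:
    """Hypothetical summary of how one MoE layer would be laid out."""
    expert_weight_precision: str
    router_weight_precision: str
    shape_const_precision: str
    expert_scale_shape: tuple
    router_node_prefix: str


def plan_moe_layer(layer_idx: int, batch: int, rows: int,
                   user_precision: str) -> MoELayerPlan:
    """Apply the commit's conventions to one layer (illustrative only)."""
    return MoELayerPlan(
        # Assumption: experts keep whatever the user asked for (e.g. i4/nf4).
        expert_weight_precision=user_precision,
        # The router ignores the user-specified 4-bit type and stays i8.
        router_weight_precision=ROUTER_PRECISION,
        shape_const_precision=SHAPE_CONST_PRECISION,
        # Scales are 3D with a trailing singleton dimension.
        expert_scale_shape=(batch, rows, 1),
        # "layers.{i}" is a hypothetical prefix; only the infix is from the commit.
        router_node_prefix=f"layers.{layer_idx}{ROUTER_NAME_INFIX}",
    )


plan = plan_moe_layer(layer_idx=0, batch=4, rows=16, user_precision="nf4")
print(plan)
```

Keeping the router precision as a constant rather than a parameter mirrors the point of the fix: the user-specified 4-bit type should not reach the router weights.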