openvino
4135e538 - [MOE] Fix issues with batch, use postponed_const when fusing expert weights (#32546)

Commit

209 days ago

[MOE] Fix issues with batch, use postponed_const when fusing expert weights (#32546)

### Details:
- *Use a postponed constant instead of make try fold to limit memory usage when folding a large concatenation*
- *Fix an issue where inference would fail for batch sizes other than 1 due to an incorrect reshape target*

### Tickets:
- *ticket-id*
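The second item in the details describes a general failure mode: a reshape target written for batch size 1 no longer matches the element count once the batch grows. The commit message does not show the actual reshape, so the following is only a minimal standalone sketch of that failure mode. `resolve_reshape` is a hypothetical helper, not OpenVINO API, and the shapes used with it are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of OpenVINO: resolves a reshape target that
// may contain a single -1 ("infer this dimension") against the input shape.
// Returns std::nullopt when the target cannot hold exactly the input elements.
std::optional<std::vector<int64_t>> resolve_reshape(const std::vector<int64_t>& input_shape,
                                                    std::vector<int64_t> target) {
    // Total number of elements in the input tensor.
    int64_t total = 1;
    for (int64_t d : input_shape) total *= d;

    // Product of the explicitly given target dimensions, and position of -1.
    int64_t known = 1;
    int infer_index = -1;
    for (std::size_t i = 0; i < target.size(); ++i) {
        if (target[i] == -1) {
            if (infer_index != -1) return std::nullopt;  // at most one -1 allowed
            infer_index = static_cast<int>(i);
        } else {
            known *= target[i];
        }
    }

    if (infer_index == -1) {
        // Fully static target: valid only if the element counts match.
        if (known != total) return std::nullopt;
        return target;
    }

    // Infer the missing dimension from the remaining element count.
    if (known == 0 || total % known != 0) return std::nullopt;
    target[infer_index] = total / known;
    return target;
}
```

With an assumed input of shape `{batch, 8, 16}`, the static target `{8, 16}` resolves for `batch == 1` but is rejected for `batch == 4`, while the target `{-1, 16}` resolves to `{32, 16}` in the latter case.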

References

#32546 - [MOE] Fix issues with batch, use postponed_const when fusing expert weights

Author

mmikolajcz

Parents

b32be6d1
