openvino
f8efd350 - [LPT] FQStripping transformation rework (#33989)

Commit

40 days ago

[LPT] FQStripping transformation rework (#33989)

### Details:
Some INT16 models rely on U16/I16 FakeQuantize layers. Simply stripping these FakeQuantize operations may be insufficient when such models are executed in f16 precision, because the original (unquantized) activation values flowing through the stripped path may exceed the representable f16 range. This can lead to overflow and, consequently, to incorrect inference results.

This PR introduces a new mechanism called `ScaleAdjuster`. The `ScaleAdjuster` detects activation paths that feed into scale-invariant nodes and safely reduces the magnitude of the activation values to keep them within the f16 numeric range, without altering the model's semantic correctness. For that reason, the adjustment is possible only for activation paths that reach scale-invariant nodes.

The implementation is validated by:
- GPU functional tests, ensuring inference correctness, and
- LPT graph comparison tests, verifying structural consistency of transformations.

### Tickets:
- *CVS-180573*
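
The overflow problem and the idea behind the scale adjustment can be illustrated with a minimal sketch. This is not the `ScaleAdjuster` implementation from the PR: the helper names (`to_f16`, `max_normalize_f16`, `pick_scale`), the sample activation values, and the choice of max-normalization as the scale-invariant node are all assumptions made for illustration. The sketch emulates f16 rounding with Python's standard `struct` module and relies only on the fact that a scale-invariant function satisfies f(s * x) = f(x) for s > 0.

```python
import math
import struct

F16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_f16(x: float) -> float:
    """Round x to half precision; values beyond the f16 range become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def max_normalize_f16(values):
    """Hypothetical scale-invariant node: y = x / max|x|, computed in f16."""
    xs = [to_f16(v) for v in values]
    peak = max(abs(x) for x in xs)
    return [to_f16(x / peak) for x in xs]


def pick_scale(values):
    """Pick a power-of-two factor that brings the largest value into f16 range.

    Powers of two are used so that the scaling itself adds no rounding error.
    """
    peak = max(abs(v) for v in values)
    scale = 1.0
    while peak * scale > F16_MAX:
        scale *= 0.5
    return scale


# Assumed unquantized activations left behind after FakeQuantize stripping.
activations = [120000.0, 60000.0, 30000.0]

# Without adjustment, the first value overflows to inf and poisons the result.
naive = max_normalize_f16(activations)

# With adjustment, the values are scaled down before the scale-invariant node.
scale = pick_scale(activations)
adjusted = max_normalize_f16([v * scale for v in activations])

print("naive:   ", naive)
print("scale:   ", scale)
print("adjusted:", adjusted)
```

In this sketch the unadjusted path yields a NaN and zeros because 120000 is not representable in f16, while the scaled path recovers the expected ratios. The same reasoning explains why the adjustment is restricted to paths that end in scale-invariant nodes: for any other consumer, multiplying the activations by a constant would change the output.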

References

#33989 - [LPT] FQStripping transformation rework

Author

v-Golubev

Parents

416caa39
