openvino
ec23e215 - [GPU] Dyn quan bugfix for cache (#32582)

Commit

213 days ago

[GPU] Dyn quan bugfix for cache (#32582)

### Description of the issue (symptom, root-cause, how it was resolved)

- New attributes in dynamic quantization are missing from load/store in caching

#### Reproduction step and snapshot (if applicable. Do not attach for customer model)

- Two consecutive executions of WWB show the issue on BMG
- `$ rm -rf minicpm-1b-sft/pytorch/ov/OV_FP16-INT8_ASYM//model_cache/ ; python wwb.py --target-model minicpm-1b-sft/pytorch/ov/OV_FP16-INT8_ASYM/ --device gpu.1 --gt-data reference1.csv --num-sample 1 ; python wwb.py --target-model minicpm-1b-sft/pytorch/ov/OV_FP16-INT8_ASYM/ --device gpu.1 --gt-data reference1.csv --num-sample 1`

#### Checklist

- [x] Is it a proper fix? (not a workaround)
- [x] Did you include test case for this fix, if necessary?
- [x] Did you review existing test that can be extended to cover this scenario? Which test did you review?
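The root cause named above (attributes added to a primitive but left out of its cache load/store path) can be illustrated with a minimal, self-contained sketch. This is not OpenVINO code: the class `DynQuantAttrs`, its fields `group_size` and `asymmetric`, and the binary layout are all hypothetical stand-ins chosen for illustration. The sketch only shows why a first run (which compiles and writes the cache) can behave correctly while a second run (which reads the cache) silently falls back to a default value.

```python
import io
import struct
from dataclasses import dataclass

# Hypothetical binary layouts, not OpenVINO's actual cache format.
_OLD_FMT = "<Q"   # only the original field is serialized
_NEW_FMT = "<Q?"  # original field plus the newly added attribute


@dataclass
class DynQuantAttrs:
    """Illustrative attribute set; the field names are assumptions."""
    group_size: int
    asymmetric: bool = False  # stands in for a "new" attribute


def save_buggy(attrs, stream):
    # The new attribute is never written to the cache blob.
    stream.write(struct.pack(_OLD_FMT, attrs.group_size))


def load_buggy(stream):
    # The new attribute is never read back, so it keeps its default.
    (group_size,) = struct.unpack(_OLD_FMT, stream.read(struct.calcsize(_OLD_FMT)))
    return DynQuantAttrs(group_size)


def save_fixed(attrs, stream):
    # Every attribute is written, in a fixed order.
    stream.write(struct.pack(_NEW_FMT, attrs.group_size, attrs.asymmetric))


def load_fixed(stream):
    # Attributes are read back in the same order they were written.
    group_size, asymmetric = struct.unpack(
        _NEW_FMT, stream.read(struct.calcsize(_NEW_FMT))
    )
    return DynQuantAttrs(group_size, asymmetric)


def round_trip(attrs, save, load):
    """Serialize to an in-memory buffer and deserialize again."""
    buf = io.BytesIO()
    save(attrs, buf)
    buf.seek(0)
    return load(buf)


original = DynQuantAttrs(group_size=32, asymmetric=True)
restored_buggy = round_trip(original, save_buggy, load_buggy)
restored_fixed = round_trip(original, save_fixed, load_fixed)
```

A round-trip comparison like `round_trip(original, save, load) == original` is one simple way to catch this class of bug in a test, since it fails as soon as a field is added without updating both the save and load sides.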

References

#32582 - [GPU] Dyn quan bugfix for cache

Author

isanghao

Parents

4b3d6a7d
