openvino
f37a2e7b - Backport gather fix 2026.1 (#34907)

Commit
29 days ago
Backport gather fix 2026.1 (#34907)

backport: https://github.com/openvinotoolkit/openvino/pull/34897

### Description of the issue (symptom, root-cause, how it was resolved)

#### Symptom
Low similarity with granite-4.0-h-micro

#### Root cause
Fusing post-ops into a rank-changing gather can generate an incorrect index mapping, causing output mismatches.

When gather decreases the rank (e.g., 5D->4D), static_canonicalize_shapes pads the output back to 5D by inserting dim=1 at the gather axis (e.g., {-1,64,64,128} -> {-1,1,64,64,128}). However, the fused eltwise peer tensor remains 4D. In the jitter, GetIdx selects index slots based on the peer tensor's rank (4D -> b,f,y,x), so the kernel's z loop variable, which iterates over actual data, is never used for peer indexing. This causes the fused eltwise to read incorrect data, as the f slot always maps to 0 (the padded dimension) instead of the actual data dimension.

#### Resolution
Disable gather post-op fusion when the rank decreases from input to output, while keeping a safe exception for scalar eltwise cases. Although eltwise is the root cause for this model, quantize is handled the same way because of potential issues.

Gather eltwise post-op fusion in rank decrease

| Post-op                 | Fusion |
|-------------------------|:------:|
| Eltwise (scalar)        |   O    |
| Eltwise (per-channel)   |   X    |
| Eltwise (full-tensor)   |   X    |

#### Problematic graph
Gather_4: in[1,2,64,64,128] -> out[1,64,64,128] + Multiply_27+Add_9

<img width="1597" height="1081" alt="image" src="https://github.com/user-attachments/assets/fa3afa2f-39af-4168-b281-0f988e37d3fe" />

#### Reproduction step and snapshot (if applicable. Do not attach for customer model)
$ python ./tools/who_what_benchmark/whowhatbench/wwb.py --target-model /mnt/models/ov-share-13.iotg.sclab.intel.com/cv_bench_cache/WW11_llm-optimum_2026.1.0-21296/granite-4.0-h-micro/pytorch/ov/FP16 --gt-data /mnt/models/ov-share-04.iotg.sclab.intel.com/cv_bench_cache/AC_llm/wwb_ref_gt_data_cache/2026.1.0-21296-4589d335731_nat_ref/CPU_ICX/default_data_wwb/cache_nat_refs_cli/granite-4.0-h-micro__NAT/reference.csv --model-type text --genai --device GPU.1 --output ./wwb --verbose

#### Checklist
- [ ] Is it a proper fix? The fundamental fix is to make the peer rank the same as the gather's and process it.
- [x] Did you include test case for this fix, if necessary? Yes
- [x] Did you review existing test that can be extended to cover this scenario? Which test did you review? gather_fusion_test

### Tickets:
- *CVS-183103*

---------

Signed-off-by: hyunback <hyunback.kim@intel.com>
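
The index mismatch described under the root cause can be illustrated with a small, self-contained Python sketch. It is not OpenVINO code: the toy shapes stand in for in[1,2,64,64,128] -> out[1,64,64,128], and the helper names (`flat_offset`, `peer_index_by_peer_rank`, `peer_index_expected`, `count_mismatches`) are hypothetical. It only models the idea that picking index slots by the 4D peer rank drops the z loop variable.

```python
from itertools import product

# Toy shapes (assumption): small stand-ins for the real 64x64x128 tensors.
OUT_5D = (1, 1, 2, 3, 4)   # padded gather output: b, f (padded dim=1), z, y, x
PEER_4D = (1, 2, 3, 4)     # full-tensor eltwise peer, left at 4D


def flat_offset(index, shape):
    """Row-major offset of `index` inside a tensor of `shape`."""
    offset = 0
    for i, dim in zip(index, shape):
        offset = offset * dim + i
    return offset


def peer_index_by_peer_rank(b, f, z, y, x):
    # Models slot selection by the peer's rank (4D -> b, f, y, x).
    # The z loop variable is dropped, and f is always 0 (padded dim).
    return (b, f, y, x)


def peer_index_expected(b, f, z, y, x):
    # After padding, the real data dimension lives in z,
    # so the 4D peer should be addressed with z in its second slot.
    return (b, z, y, x)


def count_mismatches():
    """Return (mismatching elements, total elements) over the padded output."""
    mismatches = 0
    total = 0
    for idx in product(*(range(d) for d in OUT_5D)):
        got = flat_offset(peer_index_by_peer_rank(*idx), PEER_4D)
        want = flat_offset(peer_index_expected(*idx), PEER_4D)
        mismatches += int(got != want)
        total += 1
    return mismatches, total


if __name__ == "__main__":
    bad, total = count_mismatches()
    print(f"{bad} of {total} peer reads use the wrong offset")
```

In this toy model, every element with z != 0 reads the peer at the wrong offset, while a scalar peer would be unaffected because it has a single element regardless of index. That is consistent with the table above, where only scalar eltwise stays fusable.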