openvino
7a04d1d3 - [GPU] Fix reorder eltwise broadcast (#31221)

Commit
271 days ago
### Description of the issue

- symptom: Functional issue on dGPU (oneDNN) and accuracy issue on iGPU (clDNN execution).
  `Segmentation fault`
  `Exact Match accuracy : 77.87% Mean Iou Score: 0.7564`
- root-cause: Target layer 'Add_9569' takes inputs of different dimensions with dynamic shape for NUMPY auto-broadcasting.
  ![layer_ir](https://github.com/user-attachments/assets/32b47fa0-4d32-4b9f-a8b1-478dc54c59f2)
  There is currently no logic implemented for this NUMPY broadcasting. The add_required_reorder pass tried to add a Reorder layer from bfyx to b_fs_zyx_fsv16, which is not supported.
  ![CVS-169054](https://github.com/user-attachments/assets/0d539b0e-951b-41fa-9955-bb1527007df1)
  This caused both the execution failure and the accuracy issue. The Reorder kernel was also found not to handle a -1x-1x2 shape properly for a 5-dimensional blocked format.
- how resolved: When NUMPY auto-broadcasting of an Eltwise layer targets a blocked format, no Reorder is added. This prevents an invalid reorder of the smaller tensor to the blocked format.
  <img width="1872" height="570" alt="image" src="https://github.com/user-attachments/assets/d5581ea1-0d4d-425d-a8d4-20153a8743ac" />
  static :
  <img width="1251" height="823" alt="image" src="https://github.com/user-attachments/assets/519d4ae4-fe6c-4a26-91f8-7dda1d2b8557" />
  Result : `Exact Match Accuracy : 100.00% Mean IoU Score : 0.9979`

#### The code and line that caused this issue

- No target format supported : reorder_impls.cpp
- Added logic into : primitive_inst.cpp reorder_inputs.cpp

#### Reproduction step and snapshot

- Download attached package [06585866](https://intel-my.sharepoint.com/:f:/p/sahira_rizvi/Eqfmh_PV9EZJnlmGsurqHrgBor5F4cXNvK-ZMkwp9ylFqg?e=AoycQy) from ticket
  `$ python yolov5/ov_detect.py`
  `$ python compare.py OV_CPU OV_GPU`

#### Checklist

- [x] Is it a proper fix? (not a workaround)
- [x] Did you include test case for this fix, if necessary?
- [x] Did you review existing test that can be extended to cover this scenario? Which test did you review? : reviewed unittests with "*broadcast_test_two_inputs_blocked_format*" and "align_shape_for_numpy_broadcast_test"

### Tickets:

- *169054*, *171172*

---------

Signed-off-by: Min, Byungil <byungil.min@intel.com>
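
#### Illustrative sketch (not part of the patch)

To make the root cause and the fix easier to follow, here is a minimal Python sketch of NUMPY-style shape broadcasting and of the kind of check the fix describes. It is not the OpenVINO implementation: `numpy_broadcast_shape`, `should_skip_reorder`, and the `BLOCKED_FORMATS` set are hypothetical names introduced for illustration, the shapes are static stand-ins for the dynamic `-1x-1x2` case, and the actual conditions in reorder_inputs.cpp and primitive_inst.cpp may differ.

```python
from itertools import zip_longest

# Illustrative subset of blocked layouts; the real plugin supports more.
BLOCKED_FORMATS = {"b_fs_yx_fsv16", "b_fs_zyx_fsv16"}


def numpy_broadcast_shape(a, b):
    """Broadcast two static shapes using NumPy rules (right-aligned dims)."""
    result = []
    # Walk both shapes from the innermost dimension, padding the shorter with 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
    return tuple(reversed(result))


def should_skip_reorder(input_shape, input_format, output_shape, target_format):
    """Hypothetical predicate: skip the Reorder when a plain-format input
    still has to be broadcast to reach the blocked-format output shape."""
    needs_broadcast = tuple(input_shape) != tuple(output_shape)
    return (
        target_format in BLOCKED_FORMATS
        and input_format not in BLOCKED_FORMATS
        and needs_broadcast
    )


big = (1, 16, 4, 4, 2)   # 5-dimensional input
small = (4, 4, 2)        # smaller input that relies on broadcasting
out = numpy_broadcast_shape(big, small)
skip_small = should_skip_reorder(small, "bfyx", out, "b_fs_zyx_fsv16")
skip_big = should_skip_reorder(big, "bfzyx", out, "b_fs_zyx_fsv16")
```

In this sketch the smaller input is left in its plain layout (`skip_small` is true), while the full-size input can still be reordered (`skip_big` is false), which mirrors the behaviour described under "how resolved".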