openvino
134cd5f2 - [GPU] Fix onednn concat validation for non-block-aligned feature in blocked formats (#34506)

Commit
54 days ago
### Details:
- Fix NaN output in the onednn concat layer when the input feature dimension is not aligned to the block size in blocked memory formats.

### Description of the issue (symptom, root cause, how it was resolved)
- **Symptom**: The TF_Separate_Bass model produces NaN values on GPU with FP16 precision at the concat:Transpose_125956059 layer inside a Loop sub-graph. Two clean inputs [2,24,16,256] in f16 b_fs_yx_fsv16 format are concatenated along the feature axis to [2,48,16,256], but the output contains more than 800 NaN values.
- **Root Cause**:
  - The concat layer's two inputs have feature=24 in b_fs_yx_fsv16 (block size=16) format, where 24 % 16 != 0 (not block-aligned).
  - validate_impl() in concatenation_onednn.hpp checks output feature alignment (is_feature_aligned(out_layout)) but does not check input feature alignment.
  - The output feature 48 is aligned (48 % 16 == 0), so the check passes and onednn concat is selected.
  - The onednn concat kernel has a bug in handling non-block-aligned input features in blocked formats, which causes data corruption at block boundaries.
  - Static models are not affected: build-time allocation always zero-fills padding, so the padding is safe.
- **Resolution**:
  - In validate_impl() in concatenation_onednn.hpp, add an `if (node.is_dynamic() && !is_feature_aligned(in_layout))` check for every input layout inside the dependency loop, consistent with the existing output layout check.
  - This ensures onednn concat is rejected only when the combination of dynamic memory reuse (no zero-fill) and non-block-aligned input features would produce incorrect results. Static models retain the onednn path, so their performance is unaffected.
  - When onednn is rejected for non-block-aligned inputs, the framework falls back to OCL concat, which handles this case correctly.
  - Added unit test concat_gpu_onednn.dynamic_non_block_aligned_feature to verify the fix.

#### The code and line that caused this issue (if it is not changed directly)
https://github.com/openvinotoolkit/openvino/blob/81bb2f9d63fefa933a5aec40a6560364bb392a2b/src/plugins/intel_gpu/src/graph/impls/onednn/concatenation_onednn.hpp#L87-L100

#### Reproduction step and snapshot (if applicable. Do not attach for customer model)

    python -m pytest test_ovc_mo.py \
        -n 2 \
        --tb=native \
        --env_conf=.automation/env_config.yml \
        --test_conf=.automation/test_configs/desktop_test_config_gpu_llm.yml \
        -m "not launch_only_if_manually_specified" \
        --pregen_irs=models/irs_mapping.csv \
        --tf_models_version=1.15.2 \
        --modules pipelines/production/tf/light \
        -k "TF_Ssd_Inception_v2_coco_api_2_True" \
        --dynamism_type=None \
        --log-cli-level INFO

#### Problematic graph
- Original IR
<img width="2761" height="1138" alt="image" src="https://github.com/user-attachments/assets/3c005f57-f204-483a-96c5-bed751ca56ff" />
- Current IR
<img width="3157" height="1123" alt="image" src="https://github.com/user-attachments/assets/3650ab8d-987c-4598-91c0-dcacf7b046bb" />

#### Checklist
- [v] Is it a proper fix? (not a workaround)
- [v] Did you include test case for this fix, if necessary?
- [v] Did you review existing test that can be extended to cover this scenario? Which test did you review?

### Tickets:
- *CVS-181149*

---------

Signed-off-by: zhanmyz <yazhan.ma@intel.com>
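
#### Illustrative sketch of the validation logic

The alignment arithmetic and the validation decision described under Root Cause and Resolution can be sketched as follows. This is a simplified, standalone illustration and not the code in concatenation_onednn.hpp: `LayoutSketch`, `is_feature_aligned_sketch`, `padding_lanes`, and `onednn_concat_allowed` are hypothetical names introduced here, and the sketch assumes the existing output check simply rejects any output whose feature count is not a multiple of the block size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a layout: only the two fields this sketch needs.
struct LayoutSketch {
    std::int64_t feature;     // logical feature (channel) count
    std::int64_t block_size;  // feature block size, e.g. 16 for b_fs_yx_fsv16; <= 1 means a plain format
};

// True when the feature count fills whole blocks, i.e. the last block has no padding.
inline bool is_feature_aligned_sketch(const LayoutSketch& l) {
    return l.block_size <= 1 || l.feature % l.block_size == 0;
}

// Number of unused (padding) lanes in the last feature block.
inline std::int64_t padding_lanes(const LayoutSketch& l) {
    if (l.block_size <= 1) return 0;
    const std::int64_t rem = l.feature % l.block_size;
    return rem == 0 ? 0 : l.block_size - rem;
}

// Sketch of the decision: returns false when the onednn concat path should be rejected.
inline bool onednn_concat_allowed(bool is_dynamic,
                                  const std::vector<LayoutSketch>& inputs,
                                  const LayoutSketch& output) {
    // Assumed shape of the pre-existing check: the output must be block-aligned.
    if (!is_feature_aligned_sketch(output)) return false;

    // Check described in the Resolution: reject only for dynamic nodes
    // that have at least one non-block-aligned input.
    for (const auto& in : inputs) {
        if (is_dynamic && !is_feature_aligned_sketch(in)) return false;
    }
    return true;
}
```

With the shapes from the Symptom (two inputs with feature=24, output with feature=48, block size 16), each input leaves 8 padding lanes in its second block, the output is aligned, and the sketch rejects the onednn path only when the node is dynamic.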