openvino
25f0caf2 - [GPU] Update DisableFP16CompForGemma3RMSPattern for gemma-3b-it fp16 model (#33197)

144 days ago
### Description of the issue (symptom, root cause, how it was resolved)
- gemma-3-4b-it FP16 shows accuracy degradation after PR#32414 (wwb similarity 0.885484 -> 0.10155).
- DisableFP16CompForGemma3RMSPattern does not match on gemma-3-4b-it FP16, and the output ClampFP16 was removed by that PR. This causes FP16 range overflow (inf) and NaN in the result.
- The gemma-3-4b-it FP16 model has a Convert on RMS input1, while DisableFP16CompForGemma3RMSPattern expects a constant.
- Fix: accept a Convert in the RMS input1 check in DisableFP16CompForGemma3RMSPattern.

#### The code and line that caused this issue (if it is not changed directly)
- https://github.com/openvinotoolkit/openvino/blob/ca968e4d115a45ae0e12ae5cfce7835080691c60/src/plugins/intel_gpu/src/plugin/transformations/disable_fp16_comp_rms.cpp#L22

#### Reproduction step and snapshot (if applicable. Do not attach for customer model)
- $ benchmark_app -d GPU.1 -m WW45_llm-optimum_2025.4.0-20381-RC1/gemma-3-4b-it/pytorch/ov/FP16/openvino_language_model.xml -data_shape inputs_embeds[1,15,2560],attention_mask[1,15],position_ids[1,15],token_type_ids[1,15],beam_idx[1]
- ...

#### Problematic graph
- <graph before DisableFP16CompForGemma3RMSPattern pass>

<img width="1383" height="788" alt="image" src="https://github.com/user-attachments/assets/3db6dc3b-264d-4554-96f9-283678d89c77" />

#### Checklist
- [x] Is it a proper fix? (not a workaround)
- [x] Did you include test case for this fix, if necessary?
- [x] Did you review existing test that can be extended to cover this scenario? Which test did you review?

### Tickets:
- 177319
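The matching problem described above can be illustrated with a toy graph model. This is a minimal sketch and not the OpenVINO pattern-matcher API: the `Node` class, the op names, and both `matches_rms_*` functions are hypothetical and exist only for illustration. It also assumes that any Convert on RMS input1 should be accepted; the description does not state whether the real pass constrains what feeds the Convert.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a graph node (not an OpenVINO class)."""
    op: str                                  # e.g. "Constant", "Convert", "RMS"
    inputs: List["Node"] = field(default_factory=list)


def matches_rms_old(node: Node) -> bool:
    """Toy version of the check before the fix: input1 must be a Constant."""
    return (
        node.op == "RMS"
        and len(node.inputs) == 2
        and node.inputs[1].op == "Constant"
    )


def matches_rms_new(node: Node) -> bool:
    """Toy version of the check after the fix: input1 may be a Constant or a Convert.

    Assumption: the Convert's own input is not inspected here.
    """
    return (
        node.op == "RMS"
        and len(node.inputs) == 2
        and node.inputs[1].op in ("Constant", "Convert")
    )


# Graph A: RMS whose input1 is a plain Constant.
x = Node("Parameter")
rms_direct = Node("RMS", [x, Node("Constant")])

# Graph B: RMS whose input1 goes through a Convert, as reported for
# the gemma-3-4b-it FP16 model.
rms_converted = Node("RMS", [x, Node("Convert", [Node("Constant")])])

for name, graph in (("direct", rms_direct), ("converted", rms_converted)):
    print(name, "old:", matches_rms_old(graph), "new:", matches_rms_new(graph))
```

In this sketch the old check misses the Convert variant, so a pass gated on it would silently skip that subgraph, which is consistent with the symptom reported here (the pattern not matching and FP16 overflow going unprotected).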