openvino
3926a153 - Fix sdpa_micro which caused accuracy issue of long-prompt wwb (#34291)

Commit
85 days ago
Fix sdpa_micro which caused accuracy issue of long-prompt wwb (#34291)

### Description of the issue
- The WWB accuracy test with long prompts showed low similarity on multiple platforms.

<img width="632" height="266" alt="long-wwb" src="https://github.com/user-attachments/assets/abd82208-66a3-423d-a225-308ddf1122a6" />

#### The code and line that caused this issue
- When WWB is executed in MIXED stage-mode on the target platforms, the sdpa config is generated by choose_config_xe2 in sdpa_ge_micro.cpp. That function refers to pre-defined values of 'wg_n_kq' and 'wg_n_vs' that are invalid.

#### Reproduction step and snapshot
- Reproduced by benchmark on dg2-512 or on MTL:
  `python benchmark.py -t text_gen -d GPU.1 -m llama-3.2-3b-instruct/pytorch/ov/OV_FP16-INT8_ASYM -pf long-prompt.jsonl --cb_config "{\"enable_prefix_caching\":true}"`
- The `--cb_config` option is required to enable MIXED stage-mode.
- A sample long prompt is attached to the ticket: [long-prompt.txt](https://github.com/user-attachments/files/25690220/long-prompt.txt)

#### Checklist
- [x] Is it a proper fix?
- [x] Did you include test case for this fix, if necessary?
- [x] Did you review existing test that can be extended to cover this scenario?

Passed llm_bench

### Tickets:
- CVS-181748, CVS-181751, CVS-181752

### AI Assistance:
- AI assistance used: no

---------

Signed-off-by: Min, Byungil <byungil.min@intel.com>
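
The reproduction command in the commit message embeds a JSON string inside shell quotes, which is easy to get wrong when retyping. The following is a minimal sketch, not part of the commit, that assembles the same argument list in Python and lets `json.dumps` produce the `--cb_config` value. The helper name `build_repro_command` and its parameters are hypothetical; the flags and default values are copied from the command above and are not checked against `benchmark.py` itself.

```python
import json
import shlex


def build_repro_command(
    device="GPU.1",
    model_dir="llama-3.2-3b-instruct/pytorch/ov/OV_FP16-INT8_ASYM",
    prompt_file="long-prompt.jsonl",
):
    """Return the reproduction command from the commit message as an argv list.

    Hypothetical helper: flag names are taken verbatim from the commit
    message and are not validated against benchmark.py.
    """
    # Serialize the continuous-batching config as compact JSON so it matches
    # the string in the commit message: {"enable_prefix_caching":true}
    cb_config = json.dumps({"enable_prefix_caching": True}, separators=(",", ":"))
    return [
        "python", "benchmark.py",
        "-t", "text_gen",
        "-d", device,
        "-m", model_dir,
        "-pf", prompt_file,
        "--cb_config", cb_config,
    ]


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(build_repro_command()))
```

Passing the list directly to `subprocess.run` avoids shell quoting entirely; `shlex.join` is only used here to print a form that can be pasted into a POSIX shell.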