openvino
[GPU] Fix fp16 intermediate overflow in fc_bf_tiled DQ scale path
#35228
Merged

ahnyoung-paul
ahnyoung-paul added category: GPU
ahnyoung-paul added pr: needs tests
ahnyoung-paul added do_not_review
ahnyoung-paul added under_perf_check
ahnyoung-paul added do_not_merge
ahnyoung-paul requested a review 55 days ago
ahnyoung-paul requested a review 55 days ago
ahnyoung-paul force pushed from b9cd5b8e to 68a567c3 55 days ago
ahnyoung-paul force pushed from 68a567c3 to 25a1ee1f 55 days ago
ahnyoung-paul removed pr: needs tests
ahnyoung-paul removed do_not_review
ahnyoung-paul removed do_not_merge
ahnyoung-paul removed under_perf_check
jade-cho
jade-cho approved these changes on 2026-04-14
isanghao
isanghao commented on 2026-04-14
isanghao
isanghao commented on 2026-04-14
ahnyoung-paul [GPU] Fix fp16 overflow issue for FC bf tiled kernel
f6e5a950
ahnyoung-paul fix unit test
17e2d34f
ahnyoung-paul optimize fc bf titled kernel
bd208037
ahnyoung-paul fix(GPU): reorder dq scale multiplication to match description
ff939f98
ahnyoung-paul fix(GPU): reorder dq scale multiplication without float tmp
8bb137e8
ahnyoung-paul fix(GPU): use float tmp for non-INT8 dq scale in fc_bf_tiled
b62fd64c
ahnyoung-paul force pushed from 0721ca5a to b62fd64c 49 days ago
ahnyoung-paul remove redundant unit test
6b84f81b
isanghao
isanghao commented on 2026-04-15
ahnyoung-paul changed the title from "Fix fp16 overflow fc bf tiled" to "[GPU] Fix fp16 intermediate overflow in fc_bf_tiled DQ scale path" 48 days ago
isanghao
isanghao approved these changes on 2026-04-16
ahnyoung-paul simplify non-INT8 DQ scale computation to single expression
3da775e6
ahnyoung-paul force pushed from 48300c2d to 3da775e6 48 days ago
isanghao enabled auto-merge 47 days ago
isanghao merged 0f352d09 into master 47 days ago
isanghao deleted the fix_fp16_overflow_fc_bf_tiled branch 47 days ago

