openvino
c4ce6e56 - [GPU] Fix OOB memory access in gemm_tiled_opt kernel for non-aligned tile dimensions (#34482)

Commit

37 days ago

[GPU] Fix OOB memory access in gemm_tiled_opt kernel for non-aligned tile dimensions (#34482)

### Description of the issue (symptom, root-cause, how it was resolved)
- The original `gemm_tiled_opt` kernel uses `BLOCK_READ_B` (sub-group block reads) to load B matrix tiles, which always reads `SIMD_WIDTH` × `B_VEC_SIZE` contiguous elements. When the N dimension is not evenly divisible by the tile size (`TILE_N`), the last tile group along N extends beyond the allocated buffer boundary, causing an out-of-bounds memory access (`CL_OUT_OF_RESOURCES`). The same issue applies to `BLOCK_READ_A` in the static K-leftover path when K is not aligned to `TILE_K`.
- Add boundary checks for `BLOCK_READ` operations in `gemm_tiled_opt.cl` to prevent `CL_OUT_OF_RESOURCES` errors when matrix dimensions are not aligned to tile sizes.

Changes:
- Add `tile_n_offset` bounds check before `BLOCK_READ_B` in dynamic and static paths (main loop and K-leftover sections)
- Add K dimension bounds check before `BLOCK_READ_A` in static K-leftover section
- Guard static path checks with `#if TILE_N_NOT_DIVISIBLE` to ensure zero overhead for tile-aligned shapes
- Add regression test for real model shape (MatMul_147904: M=128, K=1025, N=199, batch=32)

#### The code and line that caused this issue (if it is not changed directly)
- src/plugins/intel_gpu/src/kernel_selector/cl_kernels/gemm_tiled_opt.cl

#### Reproduction step and snapshot (if applicable. Do not attach for customer model)
- `$ ./benchmark_app -d GPU -m ~/cvs173214/emb.xml -hint none -nstreams 1 -nireq 1 -niter 1 -infer_precision f32`

#### Problematic graph
- <img width="617" height="456" alt="image" src="https://github.com/user-attachments/assets/ad952234-be65-4b2b-afc0-89687e678f78" />

#### Checklist
- [x] Is it a proper fix? (not a workaround)
- [x] Did you include test case for this fix, if necessary?
- [x] Did you review existing test that can be extended to cover this scenario? Which test did you review?

### Tickets:
- 173214
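The over-read described in the commit message can be illustrated with a small host-side model. This is a minimal sketch in Python, not the kernel code: the tile width of 32 is an assumed value chosen for illustration (the commit does not state the actual `TILE_N` or `TILE_K`), the helper names `tile_read_ranges`, `out_of_bounds_tiles`, and `guarded_read_row` are hypothetical, and zero-filling the out-of-range lanes is an assumption about what a guarded read might do, since the commit only says that a bounds check was added.

```python
def tile_read_ranges(n, tile_n):
    """Yield the (start, end) element range each full-width tile read touches along one dimension."""
    num_tiles = (n + tile_n - 1) // tile_n  # ceil division: the last tile may be partial
    for t in range(num_tiles):
        start = t * tile_n
        # An unguarded block read always spans a full tile, even past n.
        yield start, start + tile_n


def out_of_bounds_tiles(n, tile_n):
    """Return the tile ranges whose unguarded read would run past the end of the dimension."""
    return [(s, e) for s, e in tile_read_ranges(n, tile_n) if e > n]


def guarded_read_row(row, tile_n_offset, tile_n):
    """Read one tile from a row, checking tile_n_offset against the row length first.

    Assumption: lanes past the end are filled with 0.0; the real kernel may
    handle them differently.
    """
    n = len(row)
    if tile_n_offset + tile_n <= n:
        # Fast path: the whole tile is in range, equivalent to the unguarded read.
        return row[tile_n_offset:tile_n_offset + tile_n]
    # Slow path: element-wise read with a per-element bounds check.
    return [row[i] if i < n else 0.0
            for i in range(tile_n_offset, tile_n_offset + tile_n)]


if __name__ == "__main__":
    ASSUMED_TILE = 32  # illustrative only, not taken from the commit
    # Shape from the regression test: K=1025, N=199.
    print("N over-reads:", out_of_bounds_tiles(199, ASSUMED_TILE))
    print("K over-reads:", out_of_bounds_tiles(1025, ASSUMED_TILE))
    print("aligned N=192:", out_of_bounds_tiles(192, ASSUMED_TILE))
```

Under the assumed tile width, only the last tile along each non-aligned dimension needs the slow path, and an aligned shape produces no out-of-range tiles at all. That is the motivation for compiling the static-path checks only under `#if TILE_N_NOT_DIVISIBLE`.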

References

#34482 - [GPU] Fix OOB memory access in gemm_tiled_opt kernel for non-aligned tile dimensions

Author

wilson-seok

Parents

2908f218
