onnxruntime
79e0676e - fix: out of bounds access for resize operation (#27419)

Commit
72 days ago
fix: out of bounds access for resize operation (#27419)

### Description

This PR fixes:

* An out-of-bounds write in CUDA Resize for LINEAR mode when running trilinear paths (3D/5D)
* A race condition in the reduction kernel

### Root cause

1. The temporary dims-mapping buffer for LINEAR mode was sized using only H+W, while the trilinear coordinate mapping kernel writes D+H+W entries.
2. A shared-memory race in the block-level reduction loop inside `reduction_functions.cu`. The condition allowed threads outside the active lower half to update shared memory in the same stride phase, creating overlapping read/write hazards.

My colleague @korbinian-mechlem-snkeos noticed these warnings from compute-sanitizer:

> ========= Invalid __global__ write of size 4 bytes
> ========= at void onnxruntime::cuda::_ResizeTrilinearCoordinateMapping<float, onnxruntime::cuda::TransformCoordinate_HALF_PIXEL>(long long, long long, long long, long long, long long, long long, float, float, float, float, float, float, float, float, float, unsigned long long, bool, const T2 &, onnxruntime::cuda::LinearMappingInfo *)+0x400
> ========= by thread (17,0,0) in block (2,0,0)
> ========= Address 0xb28fff7cc is out of bounds
> ========= and is 205 bytes after the nearest allocation at 0xb28fff400 of size 768 bytes
> ========= Saved host backtrace up to driver entry point at kernel launch time

and

> ========= Warning: Race reported between Read access at void onnxruntime::cuda::detail::reduce_matrix_columns_kernel<float, float, float, onnxruntime::cuda::Identity, onnxruntime::cuda::Identity, (bool)0>(int, int, const T1 *, T2 *, T3 *, int *)+0xe80
> ========= and Write access at void onnxruntime::cuda::detail::reduce_matrix_columns_kernel<float, float, float, onnxruntime::cuda::Identity, onnxruntime::cuda::Identity, (bool)0>(int, int, const T1 *, T2 *, T3 *, int *)+0xea0 [337920 hazards]

### Motivation and Context

Update LINEAR buffer size calculation to:

* use H+W for bilinear (2D/4D)
* use D+H+W for trilinear (3D/5D)

Prevents invalid global writes and intermittent CUDA memory errors in trilinear resize workloads.

@johannes-rehm-snkeos
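
The sizing rule described above (H+W entries for bilinear, D+H+W for trilinear) can be sketched on the host side as follows. This is a minimal illustration, not the code changed in this PR: `MappingEntry`, `LinearMappingEntryCount`, and `LinearMappingBufferBytes` are hypothetical names, and the field layout of `MappingEntry` is an assumption standing in for `LinearMappingInfo`, whose definition is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the per-coordinate mapping record.
// The field layout is an assumption made for this sketch only.
struct MappingEntry {
  int origin;
  float weight;
  int extrapolate;
};

// Number of mapping entries a LINEAR coordinate-mapping pass would write:
// one entry per output coordinate along each interpolated axis.
inline std::size_t LinearMappingEntryCount(const std::vector<std::int64_t>& output_dims) {
  const std::size_t rank = output_dims.size();
  std::ptrdiff_t interpolated_axes = 0;
  if (rank == 2 || rank == 4) {
    interpolated_axes = 2;  // bilinear: H + W
  } else if (rank == 3 || rank == 5) {
    interpolated_axes = 3;  // trilinear: D + H + W
  } else {
    throw std::invalid_argument("unsupported rank for LINEAR resize");
  }
  // Sum the trailing (spatial) output dimensions.
  const std::int64_t total = std::accumulate(
      output_dims.rbegin(), output_dims.rbegin() + interpolated_axes, std::int64_t{0});
  return static_cast<std::size_t>(total);
}

// Bytes needed for the temporary mapping buffer under the assumptions above.
inline std::size_t LinearMappingBufferBytes(const std::vector<std::int64_t>& output_dims) {
  return sizeof(MappingEntry) * LinearMappingEntryCount(output_dims);
}
```

For a 5D output of shape `{1, 1, 8, 20, 30}`, summing only the last two dimensions gives 50 entries, while the trilinear kernel needs 58; writes to the final 8 entries would land past the end of the undersized buffer, which matches the kind of out-of-bounds write reported above.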