pytorch
301be851 - Fix grid_sample out of boundary when grid contains large numbers (#35506)

4 years ago

Fix grid_sample out of boundary when grid contains large numbers (#35506)

Summary:
This PR fixes https://github.com/pytorch/pytorch/issues/35202, fixes the GPU part of https://github.com/pytorch/pytorch/issues/24823, and is related to https://github.com/pytorch/pytorch/issues/24870.

Here is the origin of the problem.

1. When the grid contains large numbers, such as `grid.min() == -10059144 grid.max()==67680944` in https://github.com/pytorch/pytorch/issues/35202, or `nan, inf, 1.0E20` in https://github.com/pytorch/pytorch/issues/24823, the code at https://github.com/pytorch/pytorch/blob/4d39aeec271fde5a89aa68c7588023205c5ca8a9/aten/src/ATen/native/cuda/GridSampler.cu#L309-L321 unnormalizes `ix, iy` to very large numbers that exceed INT_MAX. The `ix_nw, iy_nw` variables are then cast to INT_MAX, and the other variables computed from them with "+1" become INT_MIN.

2. These INT_MAX and INT_MIN values should not be a big problem by themselves, because of https://github.com/pytorch/pytorch/blob/4d39aeec271fde5a89aa68c7588023205c5ca8a9/aten/src/ATen/native/cuda/GridSampler.cu#L358-L362 and https://github.com/pytorch/pytorch/blob/4d39aeec271fde5a89aa68c7588023205c5ca8a9/aten/src/ATen/native/cuda/GridSampler.cuh#L202-L205. The `within_bounds_2d` functions are supposed to guard the if-statement, prevent the illegal memory access, and leave those output values as zero (`padding_mode='zeros'`).

3. Now here comes the problem: `within_bounds_2d` is declared "inline". We found that the `+1` statement and the `>=0` statement may lead the compiler to "optimize" the code. That is,

```cpp
int B = something;
int a = something;
int b = a + 1;
bool r = (b >= 0 && b < B);
```

will be compiled into assembly code equivalent to

```cpp
int B = something;
int a = something;
bool r1 = (a > -2);
int b = a + 1;
bool r2 = (b < B);
bool r = r1 && r2;
```

This looks nice, but when a = INT_MAX, `a+1` is undefined behavior. Typically we get b = INT_MIN. Then r1 is true (INT_MAX > -2) and r2 is true (INT_MIN < B), so the boolean result from the compiled assembly is true, although `b >= 0` is false. `within_bounds_2d` no longer guards us from the illegal memory access.

4. There are different ways to fix this bug. For example, we could change all of the "ix_nw, iy_nw" values to `int64_t`. That would be a potential performance issue, and it doesn't prevent the examples in https://github.com/pytorch/pytorch/issues/24823 with 1E20 in the grid. One minimal fix that I found is to prevent `within_bounds_2d` from being inlined, so that the compiler won't optimize the `+1` and `>=0` code together.

I did a short performance test, just to make sure this forced noinline solution doesn't cause a regression. The performance script can be found at https://github.com/xwang233/code-snippet/blob/a6f8bce52222cd1c5270e22a87a4699b65741686/grid-sample/grid-sample.ipynb.

For this `__attribute__((noinline))` macro, I have tested it on nvcc, and there was no problem. I'm not sure if it also works on clang.

cc csarofeen ptrblck ngimel bnehoran zasdfgbnm SsnL

Pull Request resolved: https://github.com/pytorch/pytorch/pull/35506

Differential Revision: D20799304

Pulled By: ngimel

fbshipit-source-id: fc70289b35039fad954908a990ab0a2f16fbfcb2
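For illustration only, the check on a "+1" neighbor index described in item 3 can also be written so that `a + 1` is never evaluated in `int`, which removes the undefined behavior the compiler relied on. This is a sketch, not the fix applied in this PR (which forces `within_bounds_2d` to be noinline). The helper names below are hypothetical and do not exist in PyTorch, and the first variant assumes `B >= 1`, as for a tensor dimension. The widened variant corresponds to the `int64_t` idea from item 4, which the summary notes may cost performance.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not part of PyTorch): reports whether the neighbor
// index a + 1 lies in [0, B) without ever evaluating a + 1 in int.
// Assumes B >= 1, as for a tensor dimension, so B - 1 cannot overflow.
inline bool next_index_within_bound(int a, int B) {
  // a + 1 >= 0  <=>  a >= -1, and a + 1 < B  <=>  a < B - 1.
  return a >= -1 && a < B - 1;
}

// Same result via a widened intermediate; long long is at least 64 bits,
// so a + 1 cannot overflow for any int a.
inline bool next_index_within_bound_wide(int a, int B) {
  const long long b = static_cast<long long>(a) + 1;
  return b >= 0 && b < B;
}

// Small usage check covering the saturated case from the description.
inline void check_next_index_helpers() {
  assert(!next_index_within_bound(INT_MAX, 8));  // a + 1 would overflow in int
  assert(next_index_within_bound(-1, 8));        // neighbor index is 0
  assert(!next_index_within_bound(7, 8));        // neighbor index is 8 == B
  assert(next_index_within_bound_wide(6, 8));    // neighbor index is 7
}
```

With `a = INT_MAX`, both helpers return false, so a guard built on them would skip the memory access instead of reading out of bounds.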

Author

xwang233

Committer

facebook-github-bot

Parents

16774f73
