Mobile CPU allocator. (#36032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36032
QNNPACK and XNNPACK may access the input and/or output tensors out of bounds.
This is by design, chosen to make the implementation of micro-kernels both
simpler and faster, as they need not individually handle the corner cases
where the number of processed elements is not a multiple of the SIMD register
width. This behavior will trigger ASAN, though, and may result in a segfault
if the accessed memory location happens to fall on a page the current process
has no read access to.

Here we define a custom allocator that allocates the extra storage required
to make this behavior safe. This allocator could have been restricted to
QNNPACK and XNNPACK only, but that would have negative performance
ramifications: any input tensor not allocated with this allocator to begin
with would have to be reallocated and copied over. Making this allocator the
default on mobile builds minimizes the probability of unnecessary
reallocations and copies, and also enables acceleration of operations where
the output tensor is allocated outside of the function doing the
implementation, in which case the implementation cannot simply re-allocate
the output with the guarding allocator.
Test Plan: Imported from OSS
Differential Revision: D20970217
Pulled By: AshkanAliabadi
fbshipit-source-id: 65cca2d38d7c0cef63c732f393016f50f1fa5199