pytorch
0a9764ec - [nnc] Expose vectorized math functions to jit fuser. (#51190)

3 years ago
[nnc] Expose vectorized math functions to jit fuser. (#51190)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51190

We want to be able to call fast vectorized functions from sleef inside the jit fuser, but only when they're supported by the host processor. Enabling this feature has two parts:

1. Record the addresses of the symbols, assuming they're defined. Sleef only defines vectorized functions if AVX is enabled, so we need to define __AVX__ to get access to those symbols. We don't actually need to compile anything with AVX; the symbols just have to be present.
2. Before emitting a call to sleef, check if the host processor actually has AVX. LLVM makes this easy since we can just check the target feature string for "+avx".

ghstack-source-id: 120614086

Test Plan:
```
buck run mode -c python.package_style=inplace //caffe2/benchmarks/cpp/tensorexpr:bench_ops
```
shows a significant speedup on most math functions (esp sigmoid, which goes from 13% of ATen speed to parity).

Reviewed By: navahgar

Differential Revision: D26096170

fbshipit-source-id: b7268a50d73f8dc03b4db11cc38b8402387eed2d
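
The two parts described in the summary can be illustrated with a minimal, standalone C++ sketch. It is not the code from this commit: it uses no LLVM or Sleef APIs, and the names `fakeVectorExp`, `scalarExp`, `recordSymbols`, `hasAvx`, and `selectExp` are hypothetical stand-ins. It only shows the shape of the idea: record function addresses by name up front, then consult a feature string for "+avx" before choosing the vectorized entry point.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <unordered_map>

using UnaryFn = double (*)(double);

// Hypothetical stand-in for a vectorized Sleef routine. It is scalar here so
// that the sketch compiles on any host, with or without AVX.
double fakeVectorExp(double x) { return std::exp(x); }

// Scalar fallback used when the host lacks AVX or the symbol was not recorded.
double scalarExp(double x) { return std::exp(x); }

using SymbolTable = std::unordered_map<std::string, UnaryFn>;

// Part 1: record the addresses of the symbols by name. Only the address is
// needed at this point; nothing is called yet.
SymbolTable recordSymbols() {
  SymbolTable table;
  table["fake_vector_exp"] = &fakeVectorExp;  // hypothetical symbol name
  return table;
}

// Part 2: check the feature string for "+avx". This is a plain substring
// search, so it also matches "+avx2" and "+avx512f" (an assumption of this
// sketch; those features imply AVX support).
bool hasAvx(const std::string& features) {
  return features.find("+avx") != std::string::npos;
}

// Choose the callee: the recorded vectorized symbol if the host reports AVX
// and the symbol is present, otherwise the scalar fallback.
UnaryFn selectExp(const SymbolTable& table, const std::string& features) {
  if (hasAvx(features)) {
    auto it = table.find("fake_vector_exp");
    if (it != table.end()) {
      assert(it->second != nullptr);
      return it->second;
    }
  }
  return &scalarExp;
}
```

A caller would build the table once with `recordSymbols()` and then call `selectExp(table, features)` with the host's feature string each time it is about to emit a call. Keeping the symbol lookup separate from the feature check mirrors the commit's point that the symbols only have to be present, while the decision to use them depends on the processor the code actually runs on.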