[Static Runtime] Add sign/abs/log1p/mul fusion pass (#64209)

Commit

3 years ago

[Static Runtime] Add sign/abs/log1p/mul fusion pass (#64209)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64209

Add a new fusion pass that transforms the following pattern:

```
graph(%input):
    %0 : Tensor = aten::sign(%input)
    %1 : Tensor = aten::abs(%input)
    %2 : Tensor = aten::log1p(%1)
    %res : Tensor = aten::mul(%0, %2)
    return (%res)
```

into a single op:

```
graph(%input):
    %res : Tensor = static_runtime::signed_log1p(%input)
    return (%res)
```

The intent is to reduce the number of passes over the tensor. However, enabling this pass actually causes a performance regression, probably due to a lack of vectorization in the fused implementation. Because of this issue, this diff **does not** enable this pass.

Followup: navahgar will add an NNC kernel that is faster than the unfused version and enable this pass. We still need this version as a fallback, since the NNC kernel will not support all dtypes.

Test Plan:
`buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest -- SignedLog1p`

The test passed with the new graph pass both disabled and enabled.

Reviewed By: hlu1

Differential Revision: D30559929

fbshipit-source-id: e4e080cb2e6a705cfdde1fc98bee92b723f8132a
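For readers unfamiliar with the pattern, the computation being fused is `sign(x) * log1p(abs(x))`. The following is a minimal pure-Python sketch of the idea, not the Static Runtime implementation: the function names `signed_log1p_unfused` and `signed_log1p_fused` and the helper `sign` are hypothetical, and plain lists of floats stand in for tensors. It only illustrates why the fused form touches the input once instead of running four element-wise ops with intermediate results.

```python
import math


def sign(x: float) -> float:
    # Illustrative helper for finite real scalars: returns -1.0, 0.0, or 1.0.
    return float((x > 0) - (x < 0))


def signed_log1p_unfused(values):
    # Mirrors the four-op graph: each step walks a full-size list
    # and materializes an intermediate result.
    signs = [sign(v) for v in values]            # aten::sign
    magnitudes = [abs(v) for v in values]        # aten::abs
    logs = [math.log1p(m) for m in magnitudes]   # aten::log1p
    return [s * l for s, l in zip(signs, logs)]  # aten::mul


def signed_log1p_fused(values):
    # Mirrors the fused op: a single pass with no intermediate lists.
    return [sign(v) * math.log1p(abs(v)) for v in values]
```

Calling both functions on the same input, for example `[-3.0, -0.5, 0.0, 2.0]`, should give identical results. Note that a single scalar loop like this is also the shape of code that can lose to vectorized per-op kernels, which is consistent with the regression the summary describes.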

References

#65112 - [LTC] Merge master

Author

Mike Iovine

Committer

facebook-github-bot

Parents

cd3be467

pytorch 616fd921 - [Static Runtime] Add sign/abs/log1p/mul fusion pass (#64209)
