[Inductor] add missing ops for cpp vectorization overrides (#90750)
For micro-benchmark, aten.elu.default and aten.elu_backward.default have poor performance with inductor compared to eager. The main reason is lack of the vectorization. With adding missing ops for cpp vectorization overrides, the vectorization could be successfully applied.
Performance data for eager v.s. inductor:
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 15">
<link id=Main-File rel=Main-File
href="file:///C:/Users/xuanliao/AppData/Local/Temp/msohtmlclip1/01/clip.htm">
<link rel=File-List
href="file:///C:/Users/xuanliao/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml">
<!--table
{mso-displayed-decimal-separator:"\.";
mso-displayed-thousand-separator:"\,";}
@page
{margin:.75in .7in .75in .7in;
mso-header-margin:.3in;
mso-footer-margin:.3in;}
tr
{mso-height-source:auto;}
col
{mso-width-source:auto;}
br
{mso-data-placement:same-cell;}
td
{padding-top:1px;
padding-right:1px;
padding-left:1px;
mso-ignore:padding;
color:black;
font-size:11.0pt;
font-weight:400;
font-style:normal;
text-decoration:none;
font-family:Calibri, sans-serif;
mso-font-charset:0;
mso-number-format:General;
text-align:general;
vertical-align:bottom;
border:none;
mso-background-source:auto;
mso-pattern:auto;
mso-protection:locked visible;
white-space:nowrap;
mso-rotate:0;}
.xl63
{mso-number-format:Percent;}
.xl64
{color:gray;}
-->
</head>
<body link="#0563C1" vlink="#954F72">
op | speedup_old | RSD (3) | speedup_new | RSD (3) | increased_performance
-- | -- | -- | -- | -- | --
aten.elu.default | 0.205947276 | 1.73% | 0.995302802 | 4.76% | 383.28%
aten.elu_backward.default | 0.336280639 | 0.58% | 1.69473642 | 1.96% | 403.96%
</body>
</html>
The new supported ops for cpp vectorization overrides:
- eq
- ne
- lt
- gt
- le
- ge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90750
Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/jansel, https://github.com/desertfire