pytorch
aa0ca994 - [Inductor] add missing ops for cpp vectorization overrides (#90750)

Commit
1 year ago
[Inductor] add missing ops for cpp vectorization overrides (#90750)

In micro-benchmarks, aten.elu.default and aten.elu_backward.default perform poorly with Inductor compared to eager mode. The main reason is a lack of vectorization. Adding the missing ops to the cpp vectorization overrides allows vectorization to be applied successfully.

Performance data for eager vs. Inductor:

op | speedup_old | RSD (3) | speedup_new | RSD (3) | increased_performance
-- | -- | -- | -- | -- | --
aten.elu.default | 0.205947276 | 1.73% | 0.995302802 | 4.76% | 383.28%
aten.elu_backward.default | 0.336280639 | 0.58% | 1.69473642 | 1.96% | 403.96%

The newly supported ops for cpp vectorization overrides:
- eq
- ne
- lt
- gt
- le
- ge

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90750
Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/jansel, https://github.com/desertfire
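
To illustrate why comparison ops matter for ELU, here is a minimal plain-Python sketch of the mask-and-select pattern that vectorized element-wise code commonly uses in place of a per-element branch. It is not Inductor's generated code: the helper names `compare` and `select`, the use of lists as stand-ins for SIMD lanes, and the assumption that ELU lowers to a comparison followed by a select are all illustrative.

```python
import math
import operator

# The six comparison ops named in the commit, mapped to Python operators.
# This table is illustrative only; it is not how Inductor registers overrides.
COMPARE_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "gt": operator.gt,
    "le": operator.le,
    "ge": operator.ge,
}


def compare(op, xs, ys):
    """Lane-wise comparison: returns one boolean per lane (a mask)."""
    fn = COMPARE_OPS[op]
    return [fn(x, y) for x, y in zip(xs, ys)]


def select(mask, if_true, if_false):
    """Lane-wise select: picks if_true where the mask is set, else if_false."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def elu(xs, alpha=1.0):
    """ELU without a per-element branch: x if x > 0 else alpha * (exp(x) - 1).

    Both candidate results are computed for every lane, then a mask chooses
    between them. Inputs are assumed small enough that exp does not overflow.
    """
    zeros = [0.0] * len(xs)
    mask = compare("gt", xs, zeros)
    negative_branch = [alpha * math.expm1(x) for x in xs]
    return select(mask, xs, negative_branch)


def elu_backward(grads, xs, alpha=1.0):
    """Gradient of ELU w.r.t. its input: grad if x > 0 else grad * alpha * exp(x)."""
    zeros = [0.0] * len(xs)
    mask = compare("le", xs, zeros)
    negative_grad = [g * alpha * math.exp(x) for g, x in zip(grads, xs)]
    return select(mask, negative_grad, grads)


if __name__ == "__main__":
    print(elu([-1.0, 0.0, 2.0]))
    print(elu_backward([1.0, 1.0], [-1.0, 2.0]))
```

Under these assumptions, a kernel that contains a comparison can only stay on the vectorized path if that comparison itself has a vectorized form, which is consistent with the commit's explanation that the missing overrides were what blocked vectorization.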