onnxruntime
a02f7699 - [MLAS] Enable FP16 for Gelu (#26815)

Commit
39 days ago
[MLAS] Enable FP16 for Gelu (#26815)

Enabled FP16 Gelu for opset 20. Gelu uses either the tanh or the erf function, depending on the approximation method used. Implemented tanh in SVE, and erf in SVE and NEON.

Gr3E results with tanh and erf approximation, GELU time in ms:

| Shape | Tanh_SVE F32 | Tanh_SVE F16 | ERF_SVE F32 | ERF_SVE F16 | Tanh_NEON F32 | Tanh_NEON F16 | ERF_NEON F32 | ERF_NEON F16 |
| -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 100 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 |
| 1000 | 0.008 | 0.007 | 0.012 | 0.008 | 0.008 | 0.008 | 0.012 | 0.008 |
| 1000000 | 0.076 | 0.039 | 0.203 | 0.07 | 0.066 | 0.043 | 0.151 | 0.064 |

Gr4 results with tanh and erf approximation, GELU time in ms:

| Shape | Tanh_SVE F32 | Tanh_SVE F16 | ERF_SVE F32 | ERF_SVE F16 | Tanh_NEON F32 | Tanh_NEON F16 | ERF_NEON F32 | ERF_NEON F16 |
| -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 100 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 |
| 1000 | 0.006 | 0.006 | 0.008 | 0.006 | 0.006 | 0.006 | 0.008 | 0.006 |
| 1000000 | 0.092 | 0.046 | 0.224 | 0.088 | 0.091 | 0.051 | 0.213 | 0.086 |

This PR is a joint contribution by:
Aruna K (@akote123)
Abhishek Jain (@abhijain1204fujitsu)
Sanket Kale (@sanketkaleoss)

---------

Co-authored-by: Sanket Kale <sanketk.kale@fujitsu.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
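
For readers unfamiliar with the two Gelu variants named in the commit message, the following is a minimal scalar reference sketch of the formulas, assuming the ONNX opset 20 `approximate` attribute values `"none"` (erf-based) and `"tanh"`. The function name `gelu_reference` is hypothetical and chosen for illustration; this is not the MLAS SVE/NEON kernel from the PR, which vectorizes these computations in FP16.

```python
import math


def gelu_reference(x: float, approximate: str = "none") -> float:
    """Scalar Gelu reference (hypothetical helper, not the MLAS kernel).

    approximate="none": exact form using the error function.
    approximate="tanh": tanh-based approximation.
    """
    if approximate == "none":
        # 0.5 * x * (1 + erf(x / sqrt(2)))
        return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
    if approximate == "tanh":
        # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
        inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
        return 0.5 * x * (1.0 + math.tanh(inner))
    raise ValueError(f"unsupported approximate mode: {approximate!r}")


if __name__ == "__main__":
    # Compare both variants on a few sample inputs.
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        exact = gelu_reference(value, "none")
        approx = gelu_reference(value, "tanh")
        print(f"x={value:+.1f}  erf={exact:+.6f}  tanh={approx:+.6f}")
```

A scalar reference like this can serve as a baseline when checking a vectorized FP16 implementation, with a tolerance that accounts for half-precision rounding.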