[MLAS] Enable FP16 for Gelu (#26815)
Enabled FP16 Gelu for opset 20. Gelu uses either the tanh or the erf
function, depending on the approximation method selected. Implemented tanh
in SVE, and erf in both SVE and NEON.
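
For reference, the two Gelu variants selected by the ONNX `approximate` attribute (`"none"` or `"tanh"`) can be sketched in scalar C++ as below. This is only an illustrative FP32 reference under the standard Gelu definitions, not the SVE/NEON kernels added in this PR; the function names `gelu_erf` and `gelu_tanh` are hypothetical.

```cpp
#include <cmath>

// Scalar reference only; names are hypothetical and not part of MLAS.

// approximate = "none": 0.5 * x * (1 + erf(x / sqrt(2)))
float gelu_erf(float x) {
    const float inv_sqrt2 = 0.70710678f;  // 1 / sqrt(2)
    return 0.5f * x * (1.0f + std::erf(x * inv_sqrt2));
}

// approximate = "tanh": 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
float gelu_tanh(float x) {
    const float sqrt_2_over_pi = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(sqrt_2_over_pi * (x + 0.044715f * x * x * x)));
}
```

A vectorized FP16 kernel would evaluate the same expressions per lane, which is why the tanh and erf routines are the pieces that need SVE/NEON implementations.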
Gr3E results with the tanh and erf approximations:

GELU (ms) / Shape | Tanh_SVE F32 | Tanh_SVE F16 | ERF_SVE F32 | ERF_SVE F16 | Tanh_NEON F32 | Tanh_NEON F16 | ERF_NEON F32 | ERF_NEON F16
-- | -- | -- | -- | -- | -- | -- | -- | --
100 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007
1000 | 0.008 | 0.007 | 0.012 | 0.008 | 0.008 | 0.008 | 0.012 | 0.008
1000000 | 0.076 | 0.039 | 0.203 | 0.07 | 0.066 | 0.043 | 0.151 | 0.064

Gr4 results with the tanh and erf approximations:

GELU (ms) / Shape | Tanh_SVE F32 | Tanh_SVE F16 | ERF_SVE F32 | ERF_SVE F16 | Tanh_NEON F32 | Tanh_NEON F16 | ERF_NEON F32 | ERF_NEON F16
-- | -- | -- | -- | -- | -- | -- | -- | --
100 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005
1000 | 0.006 | 0.006 | 0.008 | 0.006 | 0.006 | 0.006 | 0.008 | 0.006
1000000 | 0.092 | 0.046 | 0.224 | 0.088 | 0.091 | 0.051 | 0.213 | 0.086

This PR is a joint contribution by:
Aruna K (@akote123)
Abhishek Jain (@abhijain1204fujitsu)
Sanket Kale (@sanketkaleoss)
---------
Co-authored-by: Sanket Kale <sanketk.kale@fujitsu.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>