onnxruntime
Using vectorized loads (float2) for fp16 to improve performance
#11390
Merged
hariharans29 merged 6 commits into microsoft:master from ROCm:hubertlu/fastgelu
Using vectorized loads (float2) for fp16 to improve performance
664bb50e
Fix a few warnings from cpplint
68904e56
Fix a few warnings from cpplint
5dc6cb5c
hariharans29
commented on 2022-04-29
hariharans29
commented on 2022-04-29
Use __float2half2_rn and fix some cpplint warnings
64821fb1
tianleiwu
commented on 2022-05-03
Move some computaions to LaunchFastGeluKernel
4e998546
Fix some Lint C++ warning
e8c19264
tianleiwu
approved these changes on 2022-05-05
hariharans29 merged 2a90922f into master 3 years ago
Reviewers
tianleiwu
hariharans29