qlinearadd for arm/sse2/avx2 using intrinsics, enable parallel binary broadcasting #4216
Support quantized linear binary element-wise math ops, implement Q…
eee48667
Modify according to PR feedback. Mainly:
faabbe85
Utilize MlasSubtractInt32x4 in MlasDequantizeLinearVector().
ba681203
Some format fix.
6b0a3ff8
More natural parallel parameter type.
0b338cc6
Fix build break for x86.
90f57521
Comments go to 80 columns before wrapping.
1678c03a
Many macro-related changes to the assembly.
e1f277a8
Use clang-format to format the file.
6922d0e0
Fix arm32 build error.
7e709e90
Remove some duplicates across different #if defined blocks
1058e12a
working add.u8.vector to vector
a74f8505
Fix runtime bus error on real arm32 linux.
39af7d12
fix typo in store last one lane.
ed353171
arm32 qlinearadd handle scalar.
adaec558
Move qladd to separate C++ file
49cbf02f
Add neon64 qladd.
08e1e5d1
Refactor some code, improve two spots with arm64-only instructions
4cf229d2
Fix typo for arm64
8288a432
use strict op in pure c++ (min/max on float value)
0fbc3f4d
sse2 new version.
3ccd899e
merge arm/sse2/avx2
ba664cfc
pass arm/sse/avx2 linux test
9a949b5b
remove non-used assembly file.
2487f07c
Remove unused data definition and trailing spaces.
8251f96a
Fix broadcasting parallel issue.
5c5aadb2
Enhance broadcasting scenarios. Allow testing result diff due to round
71a09b85
Add Mlas or MLAS_ prefix for namespace safety.
72a6fc3d
Handle alignment issue for arm32 for GCC/MSVC. remove some unused
860964f3
Specify /arch:AVX2 for qladd_avx2.cpp
0cec23eb
Fix typo from copy/paste when unrolling. Better one GreatEqual
4a9d1ec6
Arm NEON alignment parameter is in bits rather than bytes, change it.
c0df4a87
Move qladd_avx2.cpp to intrinsics/avx2/ folder
983a8d69
Formatting using mlas style.
cf0adbdd
Double check mlas style for these files.
2c12538b
change indent 2 to 4 for qladd_avx2.cpp
d0406502
Fix Windows x86 build error: _mm_cvtsi128_si64 is not available on 32-bit x86
14070e81
Re-trigger all checks, as the previously failing pipeline was updated.
2fac863a
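For readers following the commits above, here is a minimal scalar sketch of what a quantized linear add with simple broadcasting computes. It is not the MLAS kernel from this PR. The function names `QLinearAddScalarRef` and `QLinearAddRef` are hypothetical, and the sketch assumes uint8 data, per-tensor scale and zero point, and inputs whose lengths are either equal or 1 (scalar broadcast). Rounding here uses the default round-to-nearest-even mode, so results may differ by one unit from a SIMD path that rounds differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical reference: dequantize both inputs, add in float,
// requantize with the output scale/zero point, then saturate to uint8.
inline uint8_t QLinearAddScalarRef(uint8_t a, float a_scale, uint8_t a_zp,
                                   uint8_t b, float b_scale, uint8_t b_zp,
                                   float c_scale, uint8_t c_zp) {
    const float real =
        a_scale * static_cast<float>(static_cast<int>(a) - static_cast<int>(a_zp)) +
        b_scale * static_cast<float>(static_cast<int>(b) - static_cast<int>(b_zp));
    float q = std::nearbyint(real / c_scale) + static_cast<float>(c_zp);
    q = std::min(255.0f, std::max(0.0f, q));  // saturate to the uint8 range
    return static_cast<uint8_t>(q);
}

// Hypothetical driver: a_len and b_len must be equal, or one of them must be 1.
// The output buffer c must hold max(a_len, b_len) elements.
inline void QLinearAddRef(const uint8_t* a, size_t a_len, float a_scale, uint8_t a_zp,
                          const uint8_t* b, size_t b_len, float b_scale, uint8_t b_zp,
                          float c_scale, uint8_t c_zp, uint8_t* c) {
    const size_t n = std::max(a_len, b_len);
    for (size_t i = 0; i < n; ++i) {
        const uint8_t av = a[a_len == 1 ? 0 : i];  // broadcast a scalar input
        const uint8_t bv = b[b_len == 1 ? 0 : i];
        c[i] = QLinearAddScalarRef(av, a_scale, a_zp, bv, b_scale, b_zp, c_scale, c_zp);
    }
}
```

A reference like this is mainly useful as a test oracle: a vectorized implementation can be compared against it element by element, with a tolerance of one unit where rounding behavior differs.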
tracysh approved these changes on 2020-07-01
zhanghuanrong deleted the zhalei/qladd_parallel_arm branch 5 years ago