NEON kernels for NCHWc Convolution and Pooling #25580
Rewire ORT to support a NEON version of NCHWc Conv
cc30a146
Remove reference to assembly file
f190c0d7
Add a NEON kernel for Pointwise Convolution
632870bb
Add a NEON kernel for Depthwise
159570a9
Remove placeholder implementations
52f09bf4
Add placeholder kernel for MlasConvNchwcFloatKernelNeon
b505bd64
Fix MlasConvNchwcFloatKernelNeon
790cc7ed
Use MLAS intrinsics for MlasConvNchwcFloatKernelNeon
906393af
Add MlasConvNchwFloatKernelNeon
4d322e64
Add placeholder NCHWc Pool
cb06a1a4
Vanilla C++ implementation
00caa4c1
Intrinsics for Pooling
4cead5ec
Refactored to share code
abd54916
Format file & delete unused header
74e0e3b0
Minor modifications to pass more tests
16be947d
Remove unnecessary code & formatting changes
f7d971d3
Refactor to share some code
0ff394cd
Change block size to 16
bd2b6c44
Update pooling algorithm for block size 16
2b783776
Remove comment
ee9b9431
Add correct header and refactor kernels to share code.
23425e8e
Address Copilot comments
7000e9fe
Extend kernels to Windows & Apple
c5c3f051
Merge remote-tracking branch 'upstream/main' into nchwc_conv_pool
619d87c2
Hardcode BlockSize to 16 and add it to the header.
506bf053
Increase android build size to 10% higher than the CI-reported size o…
fb5fb504
Centralize MLAS_NEON_NCHWC_BLOCK_SIZE
fb99f7dc
Merge branch 'microsoft:main' into nchwc_conv_pool
aa21aca3