Expand vectorized UTF16 transcoding to non-ASCII BMP code points #88342
WIP vectorization for UTF16->UTF8
6ae47a3f
Lots of fixes
246939c5
Fun fact: UInt16 is not the same size as UInt8
f85efe5a
See if the scalar version autovectorizes on arm64 too
4b84ced2
Build fix for experiment
25ac9705
Remove arm64-specific code
931ae62e
Adjust for 32 bit
8e9f5e02
Stop doing size math, stop duplicating work in some cases, and delete…
f0cee253
Adopt the new implementation in another place, add unsafe annotations…
4b9be8fe
Actually detect non-ascii in the fallback path
f326f61c
Remove pointless failed attempt at being clever
9263ce67
Do it all by hand, since empirically it's a lot faster for runs of no…
31779570
Merge branch 'main' into asciivec
9d6d2258
Add a (slow) scalar fallback path, and add more unsafe annotations
9dc0c968
Fix precondition
bb2437d8
Merge origin/main into asciivec
b67b9925
Speed up length calculation
0aa24817
Refactor a bit
468fb52a
Be less silly about calculating isASCII
755150d6
Some fixes for invalid content, plus slightly reducing the bias towar…
338997a4
Remove redundant checks
4ada2644
unsafe annotations, fast paths, and removing a little dead code
0433c916
Fix isASCII calculation
82973ed8
Expand SIMD length coverage to all non-surrogates
199ce793
Handle non-surrogates in transcoding too
3ce7eb6d
Somehow this fix went missing??
7cbe56a1
Revert "Handle non-surrogates in transcoding too"
c2b223ee
Revert "Expand SIMD length coverage to all non-surrogates"
3f4326c4
Merge remote-tracking branch 'origin/main' into asciivec
9c55321b
Merge remote-tracking branch 'origin/main' into asciivec
1a43eaa5
Fix a bug when surrogate pairs straddle SIMD block boundaries
01df4588
tweak loop condition to avoid potential UB
43bb42fe
Avoid falling into the scalar path unnecessarily when the buffer size…
f35094db
Misc cleanup and safety against future changes
34752f7b
Whoops forgot to add a file
38d8ff9c
Reapply "Expand SIMD length coverage to all non-surrogates"
bf428143
Reapply "Handle non-surrogates in transcoding too"
0b274711
Try vectorizing non-ASCII BMP, this time with somewhat less bad codegen
164583e4
Tweak loop to be more like the original
92c35567
Use the updated ASCII check in the other function too
d2782a48
Vectorize UTF8->UTF16 copying some too
2ed9783f
Avoid shadowing a function name with a variable name
d3f47e34
Avoid emitting symbols where we didn't before
0f002ed6
Catfish-Man force pushed from 0ef3ee0e to 0f002ed6 13 days ago
Typo fix
90ec8bd2
Merge branch 'asciivec' into asciivec-3
57ac5b97
Fix typo harder
1c3937fa
Fix a subtle bug and add a test for it
03cf34d2
Merge branch 'asciivec' into asciivec-3
548a2f3e
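
For readers following the commits above on length calculation and on surrogate pairs that straddle SIMD block boundaries, here is a rough scalar model of the idea. It is not the code from this PR: the function names, the block size of 8, and the choice to count a lone surrogate as U+FFFD (3 bytes) are all assumptions made for illustration. A fixed-size block of UTF-16 code units takes the fast path only when it contains no surrogates; otherwise one scalar is decoded by hand, looking past the block edge if needed.

```python
def utf16_units(s):
    """Hypothetical helper: split a Python string into UTF-16 code units."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[k:k + 2], "little") for k in range(0, len(data), 2)]


def bmp_utf8_width(u):
    """UTF-8 byte count for one non-surrogate BMP code unit."""
    return 1 if u < 0x80 else 2 if u < 0x800 else 3


def utf8_length_of_utf16(units, block=8):
    """Count the UTF-8 bytes needed to transcode a list of UTF-16 code units.

    `block` stands in for a SIMD vector width (an assumption, not the PR's value).
    """
    total = 0
    i = 0
    n = len(units)
    while i < n:
        chunk = units[i:i + block]
        # Fast path: a full block with no surrogates can be summed in one go.
        if len(chunk) == block and not any(0xD800 <= u <= 0xDFFF for u in chunk):
            total += sum(bmp_utf8_width(u) for u in chunk)
            i += block
            continue
        # Slow path: decode exactly one scalar, then retry the fast path.
        u = units[i]
        if 0xD800 <= u <= 0xDBFF and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Valid surrogate pair. The low half is read from units[i + 1]
            # directly, so a pair split across two blocks is still seen whole.
            total += 4
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Lone surrogate: counted as U+FFFD (assumed repair policy).
            total += 3
            i += 1
        else:
            total += bmp_utf8_width(u)
            i += 1
    return total
```

With `block=4`, the input `utf16_units("aaa\U0001F600a")` places the high surrogate in the last slot of the first block and the low surrogate in the next one, which is the straddling case named in the commit list.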