nvda
[WIP] Merge Chinese Word Segmentation work
#19166
Open

seanbudd wants to merge 125 commits into master from try-chineseWordSegmentation-staging
seanbudd
CrazySteve0605 Introduce cppjieba as a submodule for Chinese word segmentation
5cb5189d
CrazySteve0605 Update what's new
fb4efef4
CrazySteve0605 Add comments for building script of cppjieba and its dependency
ae58e9b9
CrazySteve0605 Update projectDocs/dev/createDevEnvironment.md
06070c12
CrazySteve0605 Update include/readme.md
2273a604
CrazySteve0605 Remove changes in sconscript for localLIb
3d4d9f11
CrazySteve0605 add building script for cppjieba
1fbf05f4
CrazySteve0605 add JiebaSingleton wrapper and C API for NVDA segmentation
7de7464e
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
f4cab8af
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
d4c3a926
CrazySteve0605 Update GitHub action workflow to fetch cppjieba's submodule
0d92c086
CrazySteve0605 Update .gitignore for cppjieba
da662bed
CrazySteve0605 Update building and setup script for cppjieba's dicts installation
38a12dcf
CrazySteve0605 update copyright headers based on @seanbudd's suggestions
c60c2da8
CrazySteve0605 Update include/readme.md
c853b640
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
53dd3bb5
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
eba63ab8
CrazySteve0605 add `WordSegment` module
b0ac0819
CrazySteve0605 update `textUtils/__init__.py`
9f62f04d
CrazySteve0605 update `textInfos/offsets.py`
81f20404
CrazySteve0605 update `displayModel.py`
da64cd88
pre-commit-ci[bot] Pre-commit auto-fix
557f4043
CrazySteve0605 update type annotations
f72d3488
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
19cad8a3
CrazySteve0605 add wrapper for word manager
adc22fb9
CrazySteve0605 update the word segmentation structure
4adac073
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
407d4b27
pre-commit-ci[bot] Pre-commit auto-fix
0d40f0a1
CrazySteve0605 add copyright header
676fc42f
CrazySteve0605 add type annotations
ddd48e86
CrazySteve0605 update log
3c65868c
CrazySteve0605 add trailing commas in multi-line constructs
d69e8b7c
CrazySteve0605 make wordSegment module to make file structure clearer
8244a76e
CrazySteve0605 add initialization logic to wordSeg module
3f54d62b
pre-commit-ci[bot] Pre-commit auto-fix
38b4bea4
CrazySteve0605 use multithreading for cppjieba's initialization
eeb96aa7
CrazySteve0605 add configuration for word navigation
3ba56f0a
pre-commit-ci[bot] Pre-commit auto-fix
356c11c0
CrazySteve0605 make "Auto" the default option for word navigation
4a680ea5
CrazySteve0605 update for pyright checks
97b6db7a
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
a643391d
CrazySteve0605 add wrappers for user dict management
3e495d2c
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
a4edc9e2
CrazySteve0605 resolve deprecation
3b2d8353
pre-commit-ci[bot] Pre-commit auto-fix
9e6a2e19
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
11827fb2
CrazySteve0605 simplify the initialization of cppjieba
1869ed00
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
c1fb4b85
CrazySteve0605 add `segmentedText` method
a1113d80
CrazySteve0605 add word separator to optimize braille output for Chinese text
2d7c5968
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
fe118eec
CrazySteve0605 update cppjieba to the latest commit
49cc1fe9
CrazySteve0605 update .gitattributes for .hpp header files
a9281f62
CrazySteve0605 simplify helper of `cppjieba`
2955ca8a
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
abeb1475
CrazySteve0605 update `wordSegStrategy.py`
b848e1ba
CrazySteve0605 update module importing order and type annotations
3bfbe59d
CrazySteve0605 update changelog
53158b6b
CrazySteve0605 update building script
30120f8a
CrazySteve0605 revert installing script
00796fef
CrazySteve0605 fix building script
09b18901
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
bec5dc5d
CrazySteve0605 update helper of `coojieba`
b3e08ee8
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
f5087cc7
CrazySteve0605 update `wordSegStrategy.py`
3a0badc7
pre-commit-ci[bot] Pre-commit auto-fix
cf3e1150
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
984b6eb2
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
2b1d4b33
CrazySteve0605 handle punctuation spacing
97eb6dd4
pre-commit-ci[bot] Pre-commit auto-fix
bac32106
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
0f507d51
CrazySteve0605 Revert "Update projectDocs/dev/createDevEnvironment.md"
c2cbb240
CrazySteve0605 avoid using compilation time path
194a69ea
CrazySteve0605 Update .gitattributes
2e730d6a
CrazySteve0605 Revert "update module importing order and type annotations"
a8955a3b
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
7ee08d0f
CrazySteve0605 update `wordSegStrategy.py`
90660ba3
CrazySteve0605 revert copyright header of `configSpec.py`
9537999a
CrazySteve0605 Update source/core.py
dc23346c
michaelDCurran Merge branch 'try-chineseWordSegmentation-staging' into integrateCPPJ…
30e855f7
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
5562e70b
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
38ec7ffb
CrazySteve0605 correct method naming
ccf07f9f
CrazySteve0605 update UI text for Uniscribe
250e7007
CrazySteve0605 make `cppjieba` only available when NVDA's language is set to Chinese
53b38706
CrazySteve0605 Merge branch 'master' into integrateCPPJieba
fec70a9f
CrazySteve0605 Merge branch 'integrateCPPJieba' into wordNavigationForChineseText
69617c49
CrazySteve0605 update `wordSegSegmenter.py` to handle offsets at the end of the string
111a24d1
CrazySteve0605 make initialization of word segmenters conditional on language
43bfe036
CrazySteve0605 add unittest cases for `WordSegmenter`
2eec029e
pre-commit-ci[bot] Pre-commit auto-fix
f7694572
CrazySteve0605 fixup
9479029f
CrazySteve0605 extract punctuation from `wordSegStrategy.py` to `wordSegUtils.py`
9834b686
CrazySteve0605 fix up
b69d466f
CrazySteve0605 update changelog
6f586fd0
CrazySteve0605 Merge branch 'wordNavigationForChineseText' into brailleOutputForChinese
9304a39a
CrazySteve0605 correct and simplify the offset calculations
3b7bf5fa
CrazySteve0605 update changelog
251811e4
michaelDCurran Merge branch 'try-chineseWordSegmentation-staging' into integrateCPPJ…
b327e239
michaelDCurran Merge branch 'integrateCppJieba' into try-chineseWordSegmentation-sta…
093a825f
michaelDCurran Merge branch 'try-chineseWordSegmentation-staging' into wordNavigatio…
face4bd5
CrazySteve0605 revert `Initialize Word Segmenters for Unused Languages:` checkbox an…
b40d7095
pre-commit-ci[bot] Pre-commit auto-fix
653e8087
CrazySteve0605 fixup unittests
552b42bd
CrazySteve0605 simplify the logic for 'Auto' option in Word Segmentation Standard se…
5e0e3fdb
pre-commit-ci[bot] Pre-commit auto-fix
c3a85623
CrazySteve0605 fixup
0940a73f
CrazySteve0605 Merge branch 'wordNavigationForChineseText' into brailleOutputForChinese
9ab3dba0
CrazySteve0605 fixup
d1373b20
pre-commit-ci[bot] Pre-commit auto-fix
f98b1b19
pre-commit-ci[bot] Pre-commit auto-fix
80b04729
michaelDCurran Merge branch 'master' into try-chineseWordSegmentation-staging
4a4b1aff
michaelDCurran Merge branch 'try-chineseWordSegmentation-staging' into wordNavigatio…
085ba2fc
pre-commit-ci[bot] Pre-commit auto-fix
d55d077d
CrazySteve0605 fixup
20830956
CrazySteve0605 make word segmentation module reinitialized after settings are saved
d32549f3
pre-commit-ci[bot] Pre-commit auto-fix
b8ace769
michaelDCurran Merge branch 'master' into try-chineseWordSegmentation-staging
c86b760b
michaelDCurran Merge branch 'try-chineseWordSegmentation-staging' into wordNavigatio…
042b7788
CrazySteve0605 fixup
d2714a36
CrazySteve0605 remove duplicate importing lines
db90fff0
michaelDCurran Merge pull request #18735 from CrazySteve0605/wordNavigationForChines…
b50a0d51
michaelDCurran Merge branch 'try-chineseWordSegmentation-staging' into brailleOutput…
0d27c910
pre-commit-ci[bot] Pre-commit auto-fix
9cafffb0
michaelDCurran Merge pull request #18865 from CrazySteve0605/brailleOutputForChinese
29d9f5ab
seanbudd requested a review 108 days ago
seanbudd requested a review from copilot-pull-request-reviewer 108 days ago
seanbudd requested a review from SaschaCowley 108 days ago
copilot-pull-request-reviewer commented on 2025-11-05
cary-rowen
seanbudd marked this pull request as draft 107 days ago
seanbudd
seanbudd changed the title from "Merge Chinese Word Segmentation work" to "[WIP] Merge Chinese Word Segmentation work" 107 days ago
seanbudd added the conceptApproved label
cary-rowen
CrazySteve0605
seanbudd
cary-rowen
seanbudd added the merge-early label
