EveryVoice
Improve handling of inputs
#348
Merged

roedoejet merged 36 commits into main from dev.ap/inputs
roedoejet force pushed from 9816eb48 to 8d696fb1 252 days ago
roedoejet force pushed from 2ce7fbc1 to 15e32261 251 days ago
roedoejet changed the title from "[DRAFT] Improve handling of inputs" to "Improve handling of inputs" 246 days ago
roedoejet force pushed from 271778a8 to 73251494 246 days ago
roedoejet requested a review from joanise 246 days ago
roedoejet requested a review from SamuelLarkin 246 days ago
roedoejet force pushed from def5f34c to 469b5154 246 days ago
roedoejet requested a review from MENGZHEGENG 246 days ago
roedoejet requested a review from marctessier 246 days ago
roedoejet force pushed from 469b5154 to 3bff09bb 246 days ago
roedoejet force pushed from 5b19333e to c89561f8 244 days ago
roedoejet force pushed from c89561f8 to 03df507d 239 days ago
roedoejet force pushed from 03df507d to 4966efd1 239 days ago
SamuelLarkin requested changes on 2024-04-03
SamuelLarkin requested changes on 2024-04-05
roedoejet force pushed from bc98854c to 1483d560 236 days ago
roedoejet force pushed from 1483d560 to 8178cfc7 236 days ago
roedoejet force pushed from a3ecf064 to e420858d 236 days ago
roedoejet commented on 2024-04-09
joanise commented on 2024-04-10
roedoejet refactor: initialize text processor from text config
ad8bfe94
roedoejet feat: add g2p module and updated punctuation handling
200aafe2
roedoejet refactor: large refactor of text processor
761c9b7e
roedoejet feat: write string-encoded character and phone token sequences to fil…
ce5da77e
roedoejet fix(tests): add empty string join character when decoding
f1a314f2
roedoejet refactor: multiple refactors
a011437f
roedoejet feat(wizard): update handling of input text
5f524b5d
roedoejet fix(tests): fix wizard unit tests
027cdca5
roedoejet feat(fs2.cli): add text_type specification
67516565
roedoejet feat(wizard): add progress bar for grapheme/phoneme discovery
9cf44e87
roedoejet test: add doctests to text suite
58371b63
roedoejet fix: sort fieldnames to improve filelist readability
057b9fad
joanise perf: mind your imports and keep the cli fast
db1bc0e5
joanise chore: update submodules for import refactoring
a0adcfcb
roedoejet feat: remove duplicates in symbols by default
9a3b6aec
roedoejet fix: remove lowercase ascii from default symbols
5262b9f6
roedoejet fix: missing ascii characters in preprocessing test
14636e4b
roedoejet fix: only encode character or phone strings when they exist
3ecd69d3
roedoejet fix: pin typer to less than 0.12.0
3b09521c
roedoejet refactor: defining the text representation is unnecessary
76a869bd
roedoejet refactor: change csv to psv
0840d63c
roedoejet refactor: add changes suggested by Sam
b4ede2a3
roedoejet fix: pin typer to 0.9.0
b0836be6
roedoejet fix: properly process g2p for multi-lingual filelists
8eb96be3
SamuelLarkin feat: remove punctuation from automatically guessed characters
a26453a7
SamuelLarkin docs: alternate method
8ed5f211
SamuelLarkin feat: message the user about punctuation in character set
76df2f2c
SamuelLarkin feat: simpler iteration over fields and skipping punctuation
19b8112f
SamuelLarkin feat: helper functions
9446382b
SamuelLarkin fix: doctests
1933283f
roedoejet fix(pfs): phonological features should apply punctuation transformation
c43da88e
roedoejet refactor: save pfs to pfs folder, not text
36fcaab5
roedoejet refactor: use dict mappings for symbol and id
6f4b0780
roedoejet refactor: set punctuation rule application everywhere
17821d03
roedoejet refactor: apply fixes and refactors suggested by @joanise
f656bf2f
roedoejet force pushed from 8e693281 to f656bf2f 231 days ago
joanise approved these changes on 2024-04-11
roedoejet fix: filter text data based on target training representation
887a6332
roedoejet requested a review from SamuelLarkin 230 days ago
roedoejet merged 4bc58614 into main 230 days ago
roedoejet deleted the dev.ap/inputs branch 230 days ago
