PR #348 Improve handling of inputs

refactor: initialize text processor from text config

roedoejet committed 231 days ago

feat: add g2p module and updated punctuation handling

roedoejet committed 231 days ago

refactor: large refactor of text processor

roedoejet committed 231 days ago

feat: write string-encoded character and phone token sequences to filelist

roedoejet committed 231 days ago

fix(tests): add empty string join character when decoding

roedoejet committed 231 days ago

refactor: multiple refactors

roedoejet committed 231 days ago

feat(wizard): update handling of input text

roedoejet committed 231 days ago

fix(tests): fix wizard unit tests

roedoejet committed 231 days ago

feat(fs2.cli): add text_type specification

roedoejet committed 231 days ago

feat(wizard): add progress bar for grapheme/phoneme discovery

roedoejet committed 231 days ago

test: add doctests to text suite

roedoejet committed 231 days ago

fix: sort fieldnames to improve filelist readability

roedoejet committed 231 days ago

perf: mind your imports and keep the cli fast

roedoejet committed 231 days ago

chore: update submodules for import refactoring

roedoejet committed 231 days ago

feat: remove duplicates in symbols by default

roedoejet committed 231 days ago

fix: remove lowercase ascii from default symbols

roedoejet committed 231 days ago

fix: missing ascii characters in preprocessing test

roedoejet committed 231 days ago

fix: only encode character or phone strings when they exist

roedoejet committed 231 days ago

fix: pin typer to less than 0.12.0

roedoejet committed 231 days ago

refactor: defining the text representation is unnecessary

roedoejet committed 231 days ago

refactor: change csv to psv

roedoejet committed 231 days ago

refactor: add changes suggested by Sam

roedoejet committed 231 days ago

fix: pin typer to 0.9.0

roedoejet committed 231 days ago

fix: properly process g2p for multi-lingual filelists

roedoejet committed 231 days ago

feat: remove punctuation from automatically guessed characters

roedoejet committed 231 days ago

docs: alternate method

roedoejet committed 231 days ago

feat: message the user about punctuation in character set

roedoejet committed 231 days ago

feat: simpler iteration over fields and skipping punctuation

roedoejet committed 231 days ago

feat: helper functions

roedoejet committed 231 days ago

fix: doctests

roedoejet committed 231 days ago

fix(pfs): phonological features should apply punctuation transformation

roedoejet committed 231 days ago

refactor: save pfs to pfs folder, not text

roedoejet committed 231 days ago

refactor: use dict mappings for symbol and id

roedoejet committed 231 days ago

refactor: set punctuation rule application everywhere

roedoejet committed 231 days ago

refactor: apply fixes and refactors suggested by @joanise

roedoejet committed 231 days ago

fix: filter text data based on target training representation

roedoejet committed 230 days ago

EveryVoice Improve handling of inputs #348 Merged
