Improve handling of inputs #348
roedoejet force pushed from 9816eb48 to 8d696fb1 252 days ago
roedoejet force pushed from 2ce7fbc1 to 15e32261 251 days ago
roedoejet changed the title from "[DRAFT] Improve handling of inputs" to "Improve handling of inputs" 246 days ago
roedoejet force pushed from 271778a8 to 73251494 246 days ago
roedoejet force pushed from def5f34c to 469b5154 246 days ago
roedoejet force pushed from 469b5154 to 3bff09bb 246 days ago
roedoejet force pushed from 5b19333e to c89561f8 244 days ago
roedoejet force pushed from c89561f8 to 03df507d 239 days ago
roedoejet force pushed from 03df507d to 4966efd1 239 days ago
roedoejet force pushed from bc98854c to 1483d560 236 days ago
roedoejet force pushed from 1483d560 to 8178cfc7 236 days ago
roedoejet force pushed from a3ecf064 to e420858d 236 days ago
refactor: initialize text processor from text config
ad8bfe94
feat: add g2p module and updated punctuation handling
200aafe2
refactor: large refactor of text processor
761c9b7e
feat: write string-encoded character and phone token sequences to fil…
ce5da77e
fix(tests): add empty string join character when decoding
f1a314f2
refactor: multiple refactors
a011437f
feat(wizard): update handling of input text
5f524b5d
fix(tests): fix wizard unit tests
027cdca5
feat(fs2.cli): add text_type specification
67516565
feat(wizard): add progress bar for grapheme/phoneme discovery
9cf44e87
test: add doctests to text suite
58371b63
fix: sort fieldnames to improve filelist readability
057b9fad
perf: mind your imports and keep the cli fast
db1bc0e5
chore: update submodules for import refactoring
a0adcfcb
feat: remove duplicates in symbols by default
9a3b6aec
fix: remove lowercase ascii from default symbols
5262b9f6
fix: missing ascii characters in preprocessing test
14636e4b
fix: only encode character or phone strings when they exist
3ecd69d3
fix: pin typer to less than 0.12.0
3b09521c
refactor: defining the text representation is unnecessary
76a869bd
refactor: change csv to psv
0840d63c
refactor: add changes suggested by Sam
b4ede2a3
fix: pin typer to 0.9.0
b0836be6
fix: properly process g2p for multi-lingual filelists
8eb96be3
feat: remove punctuation from automatically guessed characters
a26453a7
docs: alternate method
8ed5f211
feat: message the user about punctuation in character set
76df2f2c
feat: simpler iteration over fields and skipping punctuation
19b8112f
feat: helper functions
9446382b
fix: doctests
1933283f
fix(pfs): phonological features should apply punctuation transformation
c43da88e
refactor: save pfs to pfs folder, not text
36fcaab5
refactor: use dict mappings for symbol and id
6f4b0780
refactor: set punctuation rule application everywhere
17821d03
refactor: apply fixes and refactors suggested by @joanise
f656bf2f
roedoejet force pushed from 8e693281 to f656bf2f 231 days ago
joanise approved these changes on 2024-04-11
fix: filter text data based on target training representation
887a6332
roedoejet merged 4bc58614 into main 230 days ago
roedoejet deleted the dev.ap/inputs branch 230 days ago
Assignees
No one assigned