feat(text): apply global cleaners to symbol sets

refactors the text normalization code out into utils so that it can be
used before the text config is initialized, and ensures that the
phonemizer applies the same normalization as the final transducer,
since ipatok outputs NFD
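
For illustration, a minimal sketch of why the phonemizer output and the symbol sets need the same normalization. It uses only the standard library; `normalize_symbols` is a hypothetical helper, not the function added in this commit, and the normalization form is left as a parameter because the form the project settles on is not stated here.

```python
import unicodedata


def normalize_symbols(symbols, form="NFC"):
    """Hypothetical cleaner: normalize every symbol to one Unicode form."""
    return [unicodedata.normalize(form, s) for s in symbols]


# "é" as a single precomposed code point (NFC) vs. "e" + combining acute (NFD)
nfc_symbol = "\u00e9"
nfd_token = "e\u0301"

# Without a shared normalization, an NFD token is not found in an NFC symbol set.
assert nfd_token not in {nfc_symbol}

# Applying the same cleaner to both sides makes them comparable.
symbol_set = set(normalize_symbols([nfc_symbol], form="NFD"))
tokens = normalize_symbols([nfd_token], form="NFD")
assert all(t in symbol_set for t in tokens)
```

The two strings render identically but differ in code points, so a lookup fails unless both the symbol set and the tokenizer output pass through the same cleaner.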
fixes https://github.com/roedoejet/EveryVoice/issues/407