New speech framework including callbacks, beeps, sounds, profile switches and prioritized queuing (#7599)
* Enhance nvwave to simplify accurate indexing for speech synthesizers.
1. Add an onDone argument to WavePlayer.feed which accepts a function to be called when the provided chunk of audio has finished playing. Speech synths can simply feed audio up to an index and use the onDone callback to be accurately notified when the index is reached.
2. Add a buffered argument to the WavePlayer constructor. If True, small chunks of audio will be buffered to prevent audio glitches. This avoids the need for tricky buffering across calls in the synth driver if the synth provides fixed size chunks and an index lands near the end of a previous chunk. It is also useful for synths which always provide very small chunks.
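To illustrate the indexing pattern this enables, here is a minimal sketch of a synth driver feeding audio up to an index and using `onDone` to be notified when playback of that chunk completes. `FakeWavePlayer` is a hypothetical stand-in, since `nvwave.WavePlayer` is NVDA-internal; only the callback shape mirrors the new API.

```python
import threading

class FakeWavePlayer:
    """Hypothetical stand-in for nvwave.WavePlayer: 'plays' chunks
    asynchronously and invokes the optional onDone callback once the
    chunk has finished playing."""
    def feed(self, data, onDone=None):
        # Real playback is asynchronous; simulate completion on a worker thread.
        def play():
            if onDone:
                onDone()
        threading.Thread(target=play).start()

reached = []
done = threading.Event()
player = FakeWavePlayer()
# Feed audio up to an index, then get accurately notified when it is reached.
player.feed(b"\x00" * 1024, onDone=lambda: (reached.append(1), done.set()))
done.wait(timeout=1)
print(reached)  # → [1]
```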
* Enhancements to config profile triggers needed for profile switching within speech sequences.
1. Allow triggers to specify that handlers watching for config profile switches should not be notified. In the case of profile switches during speech sequences, we only want to apply speech settings, not switch braille displays.
2. Add some debug logging for when profiles are activated and deactivated.
* Add support for callbacks, beeps, sounds, profile switches and utterance splits during speech sequences, as well as prioritized queuing.
Changes for synth drivers:
- SynthDrivers must now accurately notify when the synth reaches an index or finishes speaking using the new `synthIndexReached` and `synthDoneSpeaking` extension points in the `synthDriverHandler` module. The `lastIndex` property is deprecated. See below regarding backwards compatibility for SynthDrivers which do not support these notifications.
- SynthDrivers must now support `PitchCommand` if they wish to support capital pitch change.
- SynthDrivers now have `supportedCommands` and `supportedNotifications` attributes which specify what they support.
- Because there are some speech commands which trigger behaviour unrelated to synthesizers (e.g. beeps, callbacks and profile switches), commands which are passed to synthesizers are now subclasses of `speech.SynthCommand`.
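The register/notify flow of the new extension points can be sketched as follows. The `Action` class here is a simplified stand-in (the real one in NVDA's `extensionPoints` module holds weak references to handlers, among other things); the two extension point names match those described above.

```python
class Action:
    """Simplified stand-in for NVDA's extensionPoints.Action
    (the real implementation holds weak references to handlers)."""
    def __init__(self):
        self._handlers = []
    def register(self, handler):
        self._handlers.append(handler)
    def notify(self, **kwargs):
        for handler in self._handlers:
            handler(**kwargs)

# The synthDriverHandler module exposes two such extension points.
synthIndexReached = Action()
synthDoneSpeaking = Action()

events = []
synthIndexReached.register(lambda synth, index: events.append(("index", index)))
synthDoneSpeaking.register(lambda synth: events.append(("done",)))

# A driver calls these as speech progresses (typically from a background thread):
synthIndexReached.notify(synth="fakeSynth", index=5)
synthDoneSpeaking.notify(synth="fakeSynth")
print(events)  # → [('index', 5), ('done',)]
```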
Central speech manager:
- The core of this new functionality is the `speech._SpeechManager` class. It is intended for internal use only. It is called by higher level functions such as `speech.speak`.
- It manages queuing of speech utterances, calling callbacks at desired points in the speech, profile switching, prioritization, etc. It relies heavily on index reached and done speaking notifications from synths. These notifications alone trigger the next task in the flow.
- It maintains separate queues (`speech._ManagerPriorityQueue`) for each priority. As well as holding the pending speech sequences for that priority, each queue holds other information necessary to restore state (profiles, etc.) when that queue is preempted by a higher priority queue.
- See the docstring for the `speech._SpeechManager` class for a high level summary of the flow of control.
New/enhanced speech commands:
- `EndUtteranceCommand` ends the current utterance at this point in the speech. This allows you to have two utterances in a single speech sequence.
- `CallbackCommand` calls a function when speech reaches the command.
- `BeepCommand` produces a beep when speech reaches the command.
- `WaveFileCommand` plays a wave file when speech reaches the command.
- The above three commands are all subclasses of `BaseCallbackCommand`. You can subclass this to implement other commands which run a pre-defined function.
- `ConfigProfileTriggerCommand` applies (or stops applying) a configuration profile trigger to subsequent speech. This is the basis for switching profiles (and thus synthesizers, speech rates, etc.) for specific languages, math, etc.
- `PitchCommand`, `RateCommand` and `VolumeCommand` can now take either a multiplier or an offset. In addition, they can convert between the two on demand, which makes it easier to handle these commands in synth drivers based on the synth's requirements. They also have an `isDefault` attribute which specifies whether this is returning to the default value (as configured by the user).
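The offset/multiplier conversion can be sketched like this. The conversion formula and parameter names below are assumptions for illustration, not NVDA's exact implementation; only the concept (store one representation, derive the other on demand, plus `isDefault`) follows the description above.

```python
class ProsodyCommand:
    """Illustrative sketch of a prosody command (e.g. PitchCommand) that
    stores either an offset (percentage points) or a multiplier and can
    convert between the two on demand. The formula is an assumption for
    illustration, not NVDA's exact code."""
    def __init__(self, offset=None, multiplier=None):
        self._offset = offset
        self._multiplier = multiplier

    @property
    def isDefault(self):
        # No explicit change requested: return to the user-configured value.
        return self._offset is None and self._multiplier is None

    @property
    def offset(self):
        if self._offset is not None:
            return self._offset
        if self._multiplier is not None:
            return int((self._multiplier - 1) * 100)
        return 0

    @property
    def multiplier(self):
        if self._multiplier is not None:
            return self._multiplier
        if self._offset is not None:
            return 1.0 + self._offset / 100.0
        return 1.0

cmd = ProsodyCommand(offset=30)
print(cmd.multiplier)  # → 1.3
print(ProsodyCommand().isDefault)  # → True
```

A synth driver can then read whichever representation its engine requires without caring which one the caller supplied.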
Speech priorities:
`speech.speak` now accepts a `priority` argument specifying one of three priorities: `SPRI_NORMAL` (normal priority), `SPRI_NEXT` (speak after the next utterance of lower priority) or `SPRI_NOW` (speech is very important and should be spoken right now, interrupting lower priority speech). Interrupted lower priority speech resumes after any higher priority speech is complete.
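The resulting ordering can be modelled with a toy manager. This is not NVDA's `_SpeechManager` (which also restores profiles and resumes mid-utterance); it only demonstrates that the highest priority non-empty queue is always serviced first, so lower priority speech effectively resumes afterwards. The constant values are illustrative.

```python
from collections import deque

SPRI_NORMAL, SPRI_NEXT, SPRI_NOW = 0, 1, 2  # illustrative values only

class ToySpeechManager:
    """Toy model of prioritized queuing: higher priority speech is
    spoken first, and remaining lower priority speech follows."""
    def __init__(self):
        self._queues = {p: deque() for p in (SPRI_NOW, SPRI_NEXT, SPRI_NORMAL)}

    def speak(self, text, priority=SPRI_NORMAL):
        self._queues[priority].append(text)

    def drain(self):
        # Always service the highest priority non-empty queue first.
        spoken = []
        while any(self._queues.values()):
            for p in (SPRI_NOW, SPRI_NEXT, SPRI_NORMAL):
                if self._queues[p]:
                    spoken.append(self._queues[p].popleft())
                    break
        return spoken

m = ToySpeechManager()
m.speak("reading document")
m.speak("alert!", priority=SPRI_NOW)
print(m.drain())  # → ['alert!', 'reading document']
```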
Refactored functionality to use the new framework:
- Rather than using a polling generator, spelling is now sent as a single speech sequence, including `EndUtteranceCommand`s, `BeepCommand`s and `PitchCommand`s as appropriate. This can be created and incorporated elsewhere using the `speech.getSpeechForSpelling` function.
- Say all has been completely rewritten to use `CallbackCommand`s instead of a polling generator. The code should also be a lot more readable now, as it is now classes with methods for the various stages in the process.
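The shape of a spelling sequence might look like the sketch below. The command classes are minimal stand-ins for the real ones in the `speech` module, and the function's parameters and exact sequence layout are assumptions; only the idea (one utterance per character, pitch change for capitals) follows the description above.

```python
# Minimal stand-ins for the real command classes in NVDA's speech module.
class PitchCommand:
    def __init__(self, offset=0):
        self.offset = offset
    def __repr__(self):
        return f"PitchCommand(offset={self.offset})"

class EndUtteranceCommand:
    def __repr__(self):
        return "EndUtteranceCommand()"

def getSpeechForSpelling(text, capPitchOffset=30):
    """Illustrative sketch: one utterance per character, raising pitch
    around capitals. Parameter names here are hypothetical."""
    seq = []
    for ch in text:
        if ch.isupper():
            seq.append(PitchCommand(offset=capPitchOffset))
        seq.append(ch.lower())
        if ch.isupper():
            seq.append(PitchCommand(offset=0))  # restore default pitch
        seq.append(EndUtteranceCommand())
    return seq

print(getSpeechForSpelling("Hi"))
```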
Backwards compatibility for old synths:
- For synths that don't support index and done speaking notifications, we don't use the speech manager at all. This means none of the new functionality (callbacks, profile switching, etc.) will work.
- This means we must fall back to the old code for speak spelling, say all, etc. This code is in the `speechCompat` module.
- This compatibility fallback is considered deprecated and will be removed eventually. Synth drivers should be updated ASAP.
Deprecated/removed:
- `speech.getLastIndex` is deprecated and will simply return None.
- `IndexCommand` should no longer be used in speech sequences passed to `speech.speak`. Use a subclass of `speech.BaseCallbackCommand` instead.
- In the `speech` module, `speakMessage`, `speakText`, `speakTextInfo`, `speakObjectProperties` and `speakObject` no longer take an `index` argument. No add-ons in the official repository use this, so I figured it was safe to just remove it rather than having it do nothing.
- `speech.SpeakWithoutPausesBreakCommand` has been removed. Use `speech.EndUtteranceCommand` instead. No add-ons in the official repository use this.
- `speech.speakWithoutPauses.lastSentIndex` has been removed. Use a subclass of `speech.BaseCallbackCommand` instead. No add-ons in the official repository use this.
* Update the espeak synth driver to support the new speech framework.
* Update the oneCore synth driver to support the new speech framework.
* Update comtypes to version 1.1.3.
This is necessary to handle events from SAPI 5, as one of the parameters is a decimal, which is not supported by our existing (very outdated) version of comtypes.
comtypes has now been added as a separate git submodule.
* Update the sapi5 synth driver to support the new speech framework.
* Fix submodule URL for comtypes. Oops!
* Ensure eSpeak emits index callbacks even if the eSpeak event chunk containing the index is 0 samples in length. This allows sayAll to function with eSpeak; I'm guessing this may have broken due to a recent change in eSpeak.
* Remove some debug print statements
* Ensure eSpeak sets its speech parameters back to user-configured values if interrupted while speaking SSML that changes those parameters.
* Add a 'priority' keyword argument to all speech.speak* functions allowing the caller to provide a speech priority.
* Alerts in Chrome and Firefox now are spoken with a higher speech priority, and no longer cancel speech beforehand. This means that the alert text will be spoken straight away, but will not interrupt other speech such as the new focus.
* Unit tests: fake speech functions now must take a 'priority' keyword argument to match the real functions.
* Remove speech compat
* synthDriverHandler.handleConfigProfileSwitch was renamed to handlePostConfigProfileSwitch. Ensure that speech uses this name now.
* synthDriverHandler.handlePostConfigProfileSwitch: reset speech queues if switching synths, so that the new synth starts with entirely fresh speech state.
This behaviour can be disabled by passing resetSpeechIfNeeded=False to this function. The speech manager does this internally when handling config profile trigger commands in speech sequences.
* Remove the audioLogic synthesizer due to its extremely low usage. If someone does require this, it could be updated and provided as an add-on.
* Speech: correct indentation, which allows profile switching in a speech sequence to work again.
* Sapi4 synthDriver: very basic conversion to speechRefactor supporting synthIndexReached and synthDoneSpeaking, though for some sapi4 synths, indexing could be slightly early.
* Convert system test synthDriver to speechRefactor so that indexing and doneSpeaking notifications are fired.
* sayAllHandler: move trigger handling into _readText as recommended.
* Remove some more speech compat stuff.
* Speech: BaseCallbackCommand's run method is now abstract. CallbackCommand has been changed slightly: it now implements its own run method which executes the callback passed to the command, rather than run being overridden directly. BeepCommand also now calls playWaveFile with async set to True rather than False, so that it does not block. Finally, BaseCallbackCommand's run method docstring now notes that the main thread is blocked while it runs.
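The abstract-base relationship described above can be sketched as follows. These are simplified stand-ins for the real classes in the `speech` module; only the structure (abstract `run`, callback passed to `CallbackCommand` rather than overriding `run`) follows the change described.

```python
from abc import ABC, abstractmethod

class BaseCallbackCommand(ABC):
    """Sketch: run() is executed by the speech manager when speech
    reaches this command. In NVDA, the main thread is blocked while
    run() executes, so it must return quickly."""
    @abstractmethod
    def run(self):
        ...

class CallbackCommand(BaseCallbackCommand):
    """Callers pass a callback instead of overriding run() directly."""
    def __init__(self, callback):
        self._callback = callback
    def run(self):
        self._callback()

calls = []
CallbackCommand(lambda: calls.append("reached")).run()
print(calls)  # → ['reached']

# Instantiating the abstract base directly now fails:
try:
    BaseCallbackCommand()
except TypeError:
    print("abstract")  # → abstract
```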
* Speech: fix up some docstrings.