Upgrading community frameworks to `audio-to-audio`. (#94)
* Add `audio-to-audio` and remove `audio-source-separation` from this
package.
* Adding `audio-to-audio` task.
- Removed `audio-source-separation` in favor of `audio-to-audio`, which
is a bit more general. The previous version was also flawed: there was
no way to return multiple channels.
- Decided to return base64 blobs in JSON. It's not super shiny, but it
does the job and enables an easy way to add annotations for audio
streams (like the instrument labels for source separation).
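A minimal sketch of what such a response could look like (the field names and helper are illustrative, not the actual API contract):

```python
import base64
import json


def build_response(audio_blobs):
    """Encode raw audio bytes as base64 strings inside a JSON payload.

    `audio_blobs` is a list of (label, wav_bytes) tuples; the field
    names here are illustrative, not the real API contract.
    """
    return json.dumps([
        {
            "label": label,  # e.g. an instrument name for source separation
            "blob": base64.b64encode(data).decode("utf-8"),
            "content-type": "audio/wav",
        }
        for label, data in audio_blobs
    ])


response = build_response([
    ("vocals", b"RIFF...wav-bytes..."),
    ("accompaniment", b"RIFF...more-bytes..."),
])
parsed = json.loads(response)
# Round-trip: the original bytes are recoverable on the client side.
assert base64.b64decode(parsed[0]["blob"]) == b"RIFF...wav-bytes..."
```

Because each channel is its own JSON object, extra per-stream metadata (labels, content types) fits naturally next to the audio itself.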
- Added GZipMiddleware within common; this helps keep the
base64-encoded JSON somewhat reasonable in size.
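Base64 inflates the raw bytes by roughly a third, and gzip claws much of that back, which is why the middleware helps. A stdlib sketch of the effect (the payload here is simulated, not real model output):

```python
import base64
import gzip
import json

# Simulated audio payload: base64 inflates raw bytes by roughly 33 %.
raw = bytes(range(256)) * 200  # ~51 KB of fake "audio" data
payload = json.dumps({"blob": base64.b64encode(raw).decode("utf-8")})

# This is what GZipMiddleware does transparently for the HTTP response.
compressed = gzip.compress(payload.encode("utf-8"))

assert len(compressed) < len(payload)
```

In a Starlette app enabling this is a one-liner, `app.add_middleware(GZipMiddleware)`, with `GZipMiddleware` imported from `starlette.middleware.gzip`.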
- Known caveat: currently we only send 1 channel to the input of the
pipeline. At the time of this commit, no model requires stereo or
multi-channel input, so keeping that change for later.
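The caveat above amounts to collapsing any multi-channel input down to a single channel before it reaches the pipeline. A hypothetical helper showing the idea for interleaved samples (the name and approach are illustrative, not the actual code):

```python
def first_channel(samples, num_channels):
    """Keep only channel 0 of an interleaved sample buffer.

    Hypothetical helper illustrating the caveat: multi-channel audio
    is reduced to a single channel before reaching the pipeline.
    """
    if num_channels < 1:
        raise ValueError("num_channels must be >= 1")
    return samples[::num_channels]


# Interleaved stereo [L0, R0, L1, R1] -> left channel only.
assert first_channel([0.1, 0.9, 0.2, 0.8], 2) == [0.1, 0.2]
```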
- A community framework update will follow.
- Because of dockerized versioning we can safely delete
`audio-source-separation`. Old `api-inference-community` versions will
continue to support it, and so will the Hub, but if you want to use
> 0.0.7 you will need to migrate.
* Adding type checks in tests for `audio-to-audio`.
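The kind of type checks meant here can be sketched as assertions over the decoded response; this is an illustrative stand-in, not the real test code:

```python
import base64


def check_audio_to_audio_output(results):
    """Illustrative type checks for an `audio-to-audio` response:
    a list of dicts, each with a string label and a valid base64 blob.
    """
    assert isinstance(results, list)
    for item in results:
        assert isinstance(item, dict)
        assert isinstance(item.get("label"), str)
        assert isinstance(item.get("blob"), str)
        base64.b64decode(item["blob"])  # must decode cleanly
```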
* Upgrading community frameworks to `audio-to-audio`.
- Deleted `audio-source-separation` as much as possible. Some legacy
code must be kept because the `audio-source-separation` tag and widget
still exist.
- Targeted `speechbrain` and `asteroid`, the only 2 frameworks
implementing `audio-source-separation` at this point.
Reference: https://github.com/huggingface/huggingface_hub/pull/76
* Rebase.
Co-authored-by: cem <csubakan@gmail.com>
* Fixing black.
* isort.
* Upgrading espnet + speechbrain to fallible start.
Co-authored-by: cem <csubakan@gmail.com>
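A "fallible start" here means the server comes up even when model loading fails, deferring the error to request time. A hedged sketch of that pattern (all names are illustrative, not the real codebase):

```python
def make_pipeline(load_fn):
    """Build the pipeline at startup; on failure, remember the error
    instead of crashing, and surface it on the first request.

    Illustrative sketch only; names do not match the real codebase.
    """
    try:
        pipeline, error = load_fn(), None
    except Exception as exc:  # startup must survive a bad model load
        pipeline, error = None, exc

    def run(inputs):
        if error is not None:
            # Report the original load failure at request time.
            raise RuntimeError(f"pipeline failed to load: {error}")
        return pipeline(inputs)

    return run


def broken_loader():
    raise ValueError("bad checkpoint")


# The process starts even though loading failed; calling `run`
# reports the error instead of the container dying at boot.
run = make_pipeline(broken_loader)
```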