cog
c889973e - Make Worker interface non-blocking

Commit
1 year ago
Make Worker interface non-blocking

As step one in support for concurrent prediction execution (e.g. for batching models), this change makes the `Worker` interface non-blocking, bringing it closer to interfaces like `concurrent.futures.ThreadPoolExecutor`, which do a similar but much more generic job.

The blocking interface this replaces was conceptually simpler, but given that the worker was primarily used through a non-blocking HTTP interface (the `Prefer: respond-async` header is what Replicate uses when running Cog predictions in production), we had to bend over backwards to use it. In particular, that meant:

- the worker had to yield heartbeat events to return control to the caller periodically
- we had to create another multi-threaded component, `PredictionRunner`, to present a non-blocking interface over the top of the blocking worker interface

In this commit, changes are restricted to `Worker`'s interface, and we hack together whatever we need to in `PredictionRunner` to keep tests passing. A future commit will replace the runner code altogether.

Both `setup` and `predict` now return `concurrent.futures.Future` objects, which complete when the setup or prediction run is completed. Heartbeat events are removed altogether. Consumers of the worker are expected to use its `subscribe` method to receive all the events emitted during a setup or predict run.

This also addresses an oversight that's been here since `Worker` was first written: we now record something useful in the prediction logs if a `BaseException` is raised by predict.

Co-Authored-By: F <f@replicate.com>
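To illustrate the shape of the interface described above, here is a minimal sketch. It is not Cog's actual `Worker`: the class name `SketchWorker`, the event tuples, the callback signature passed to `subscribe`, and the use of a plain thread are all assumptions made for illustration. It only shows the general pattern of `predict` returning a `concurrent.futures.Future` while subscribers receive events, including a log entry when a `BaseException` is raised.

```python
import threading
from concurrent.futures import Future
from typing import Any, Callable, Dict, List, Tuple

Event = Tuple[str, Any]  # hypothetical event shape: (kind, payload)


class SketchWorker:
    """Hypothetical stand-in for a non-blocking worker; not Cog's Worker."""

    def __init__(self, predict_fn: Callable[[Dict[str, Any]], Any]) -> None:
        self._predict_fn = predict_fn
        self._subscribers: List[Callable[[Event], None]] = []
        self._lock = threading.Lock()

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        # Register a callback that receives every event emitted during a run.
        with self._lock:
            self._subscribers.append(callback)

    def _publish(self, event: Event) -> None:
        # Copy the list under the lock, then call subscribers outside it.
        with self._lock:
            callbacks = list(self._subscribers)
        for callback in callbacks:
            callback(event)

    def predict(self, payload: Dict[str, Any]) -> Future:
        # Return immediately; the Future completes when the run finishes.
        result: Future = Future()

        def run() -> None:
            try:
                self._publish(("log", "starting prediction"))
                output = self._predict_fn(payload)
                self._publish(("output", output))
                result.set_result(output)
            except BaseException as exc:
                # Record something useful in the logs even for BaseException.
                self._publish(("log", f"prediction failed: {exc!r}"))
                result.set_exception(exc)

        threading.Thread(target=run, daemon=True).start()
        return result


# Usage: the caller is never blocked unless it chooses to wait on the Future.
events: List[Event] = []
worker = SketchWorker(lambda payload: payload["x"] * 2)
worker.subscribe(events.append)
future = worker.predict({"x": 21})
value = future.result(timeout=5)


def _fail(payload: Dict[str, Any]) -> Any:
    raise ValueError("bad input")


failed_events: List[Event] = []
failing_worker = SketchWorker(_fail)
failing_worker.subscribe(failed_events.append)
error = failing_worker.predict({}).exception(timeout=5)
```

Because the caller holds a `Future` and gets events through callbacks, there is no need for heartbeat events whose only purpose was to hand control back to the caller.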