docs: Document how to run Cog models as Docker images (#3064)
* Document how to run Cog models as Docker images (#1804)
Expand deploy.md with a step-by-step guide covering:
- Building and running Docker images (docker run, cog serve, cog run)
- Making predictions via the HTTP API with the correct input wrapper
- Passing file inputs via URLs and base64 data URLs
- Getting output files (data URLs and output_file_prefix)
Closes #1804
* Address PR feedback: remove cog serve image arg, fix output URL scheme
- cog serve does not accept an image name argument; remove the example
- output_file_prefix response URL preserves the https scheme
* Update docs/deploy.md
Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com>
Signed-off-by: Anish Sahoo <anishsahoo2005@gmail.com>
* Update docs/llms.txt
Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com>
Signed-off-by: Anish Sahoo <anishsahoo2005@gmail.com>
* Fix stale Rust cache that breaks stub checks when the Python patch version changes
PyO3 bakes the absolute libpython dir into target/, but python-version
floats to the latest patch. A patch bump on the runner left a stale -L
path in the cached build, breaking stub_gen linking (-lpython3.13 not
found). Add the resolved patch to the rust-cache key so a bump busts the
cache and PyO3 relinks against the present interpreter.
* Address PR feedback: document --upload-url flow, fix port and cog run guidance
- Replace unsupported per-request output_file_prefix with the server-startup --upload-url flow (cog serve and raw Docker command override)
- Add note that prediction examples use port 5001 (Docker); cog serve defaults to 8393
- Move cog run into its own one-off prediction section since it doesn't start a server
- Drop cog run from the stop-the-server instructions
* docs: fix http.md file uploads to document --upload-url, not output_file_prefix
output_file_prefix is not implemented anywhere in coglet, the Python server, or the CLI; it was silently ignored. Uploads are driven solely by the --upload-url server startup flag, for both sync and async predictions. Also correct the documented upload request format: it is a raw PUT to {upload-url}/{filename} with the file's Content-Type, not a multipart/form-data POST.
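For illustration only, a minimal Python sketch of the request shape described above: a raw PUT to {upload-url}/{filename} carrying the file's Content-Type. The helper name build_upload_request, the example.com URL, and the use of mimetypes to pick the Content-Type are assumptions made for this sketch; they are not taken from coglet or the Cog server, which may determine these values differently. The sketch only constructs the request and does not send it.

```python
import mimetypes
import urllib.parse
import urllib.request


def build_upload_request(upload_url: str, filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a raw PUT to {upload_url}/{filename}.

    Hypothetical helper: it mirrors the request shape described in the
    commit message, not the server's actual implementation.
    """
    # Assumption: guess the Content-Type from the file name, falling back
    # to a generic binary type when the extension is unknown.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # Join the prefix and file name with exactly one slash between them.
    url = upload_url.rstrip("/") + "/" + urllib.parse.quote(filename)
    return urllib.request.Request(
        url,
        data=data,  # the raw file bytes are the body, with no multipart wrapper
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Placeholder upload URL and file contents, for illustration only.
req = build_upload_request("https://example.com/upload/", "output.png", b"\x89PNG")
```

Passing req to urllib.request.urlopen would send it; a multipart/form-data POST, by contrast, would wrap the bytes in a form body instead of sending them as-is.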
---------
Signed-off-by: Anish Sahoo <anishsahoo2005@gmail.com>
Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com>