Update architecture docs to match the codebase (#2856)
* docs(architecture): remove legacy runtime, establish coglet as sole runtime
The legacy Python/FastAPI runtime has been removed and coglet is the
only runtime. This updates the architecture docs to reflect that:
- Delete architecture/legacy/ and architecture/ffi/ subdirectories
- Create canonical 03-prediction-api.md and 04-container-runtime.md
- Fix false claim that wheels are embedded in the Go binary
- Fix pydantic claims (schema gen uses inspect + custom ADT dataclasses)
- Fix stale file references (base_predictor.py, pkg/api/, etc.)
- Mark static schema path as experimental
- Fix compatibility matrix filenames, Swagger UI ghost, cog.BaseModel name
* docs(architecture): add skill, trim implementation detail from 03/04
Add skills/updating-architecture-docs/ to capture the intent and process
for maintaining architecture docs: they bridge from concepts to code
rather than summarizing code.
Apply the skill's principles to 03-prediction-api.md and
04-container-runtime.md: cut obvious Rust snippets that just restate
code (status lifecycle, idempotent PUT, error handling), keep non-obvious
ones that convey meaning prose can't (RAII guard, permit model). Replace
exhaustive file listings and code reference tables with package-level
pointers. Slim down the component ownership tree to a conceptual
description.
* docs(architecture): fix health states, complete IPC protocol, document missing features
Fix health state machine to start at UNKNOWN (not STARTING), add
UNHEALTHY state for user-defined healthchecks. Complete IPC protocol
tables with all message variants (~8 were missing). Document four
previously undocumented features: input spilling, file outputs, custom
metrics, user-defined healthchecks. Expand env var table and Go package
listing. Add hidden CLI commands. Fix crates/README.md duplicate line.
Update skill to prefer important packages over exhaustive listings.
* docs(architecture): add predictor lifecycle and life-of-a-prediction sections
Addresses the question 'what is the lifetime of BasePredictor and what
can I do with it at each stage?', which was previously undocumented:
- Predictor Lifecycle in 04: singleton instance, setup() once, self persists,
no teardown, concurrency semantics, crash is terminal (no respawn)
- Life of a Prediction in 04: 9-step request-to-response walkthrough with
error handling
- Cross-references from 01 setup()/predict() sections
- Fix the pre-existing incorrect DEFUNCT health claim in Why Two Processes
* docs(architecture): add Python SDK as a named component in overview
The SDK was in the diagram but disconnected, and it was missing from
the Components section and terminology table. It is now a first-class
entry so people can distinguish 'the CLI' from 'the SDK' from 'coglet'.