cog
df576ff5 - feat: Replace Pydantic with native Python dataclasses for cog.BaseModel (#2681)

Commit
100 days ago
feat: Replace Pydantic with native Python dataclasses for cog.BaseModel (#2681)

* chore: remove pydantic-based cog implementation

Remove the legacy pydantic-based Python SDK to prepare for the dataclass-based implementation. This includes all server code, type definitions, and associated tests.

* feat: add dataclass-based cog implementation

Replace pydantic with a pure dataclass-based implementation:
- Type inspection without pydantic overhead
- Schema generation using native Python types
- Custom coder system for complex type serialization
- API compatible with existing predictors

* refactor: simplify to single cog wheel

Remove multi-wheel complexity now that pydantic-based cog is replaced:
- pkg/wheels: embed only the cog wheel, remove cog-dataclass
- pkg/dockerfile: simplify wheel installation to single embedded wheel
- integration-tests: remove cog_dataclass condition
- CI: remove dataclass-specific test matrix entries
- tox: remove pydantic version matrix
- mise: consolidate coglet-python test task

* test: remove pydantic-specific integration tests

Delete tests that specifically test pydantic 1.x/2.x behavior which is no longer relevant with the dataclass-based implementation.

* test: unskip complex_output test

The dataclass implementation handles Pydantic BaseModel outputs via duck-typing - it checks for model_dump() (v2) or dict() (v1) methods in cog/json.py:make_encodeable(). Users can still use Pydantic for their own model types.

* test: unskip setup_subprocess_multiprocessing test

Remove obsolete skips - the test uses Python 3.10 which is supported. Verified passing with both Python and Rust coglet servers.

* test: unskip torch_baseimage tests

Remove obsolete skips - the tests use Python 3.10 which is supported. These are slow tests that will run in CI (not -short mode).

* test: unskip build_cog_version_match test

Remove obsolete skips. This test verifies cog version in base images. Verified passing with both Python and Rust coglet servers.
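
The duck-typing check described in the complex_output entry above could look roughly like the sketch below. This is not the code in cog/json.py; only the function name and the model_dump()/dict() probes come from the commit message, and the handling of dataclasses and containers is an assumption added for illustration.

```python
import dataclasses
from typing import Any


def make_encodeable(obj: Any) -> Any:
    """Recursively convert obj into JSON-friendly builtins (illustrative sketch)."""
    # Dataclass instances (not the classes themselves) become plain dicts.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return {
            field.name: make_encodeable(getattr(obj, field.name))
            for field in dataclasses.fields(obj)
        }
    # Duck-typing: Pydantic v2 models expose model_dump(), v1 models expose dict().
    for method_name in ("model_dump", "dict"):
        method = getattr(obj, method_name, None)
        if callable(method):
            return make_encodeable(method())
    if isinstance(obj, dict):
        return {key: make_encodeable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [make_encodeable(item) for item in obj]
    # Anything else is assumed to be JSON-serializable already.
    return obj
```

Because the check only looks for a callable attribute, no pydantic import is needed for it to work.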

* test: remove coglet_alpha skips from integration tests

coglet_alpha is no longer a supported configuration - remove all skips.

* refactor(coglet): remove pydantic-specific code paths

- Simplify format_validation_error to use cog's already-formatted errors
- Remove unwrap_pydantic_serialization_iterators (no longer needed)
- Remove schema_via_fastapi fallback, use cog._schemas directly
- Update Runtime enum: remove Pydantic variant, rename NonPydantic to Cog
- Update SdkImplementation: remove Pydantic/Dataclass, use Cog/Unknown
- Update detection to check for cog._adt module
- Update comments to remove pydantic references

* test: update complex_output to use cog.BaseModel instead of pydantic.BaseModel

pydantic.BaseModel outputs are no longer supported. Users should use cog.BaseModel (a dataclass) or @dataclass for structured outputs.

* feat: implement user-defined healthcheck support for Python server

Add support for user-defined healthcheck() method on predictors:
- Add Healthcheck event type to eventtypes.py
- Add get_healthcheck() helper to predictor.py
- Add healthcheck() method to Worker and _ChildWorker classes
- Add healthcheck() to PredictionRunner
- Update /health-check endpoint to call user healthcheck
- Add UNHEALTHY status to Health enum

Features:
- Sync and async healthcheck methods supported
- 5 second timeout for healthcheck execution
- Returns UNHEALTHY with error details on failure/timeout/exception

Remove [cog_dataclass] skip from healthcheck integration tests.
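
A minimal sketch of the behaviour listed under Features: call a predictor's optional healthcheck(), sync or async, under a 5 second timeout and report failure details. The helper name run_healthcheck, the (healthy, error) return shape, and the error strings are assumptions made for this example, not the names used in the commit.

```python
import asyncio
import inspect
from typing import Any, Optional, Tuple

HEALTHCHECK_TIMEOUT = 5.0  # seconds, as stated in the commit message


async def run_healthcheck(
    predictor: Any, timeout: float = HEALTHCHECK_TIMEOUT
) -> Tuple[bool, Optional[str]]:
    """Return (healthy, error_detail) for a predictor (hypothetical helper)."""
    check = getattr(predictor, "healthcheck", None)
    if check is None:
        # No user-defined healthcheck: treat the predictor as healthy.
        return True, None
    try:
        if inspect.iscoroutinefunction(check):
            result = await asyncio.wait_for(check(), timeout=timeout)
        else:
            # Run the sync check in a worker thread so the event loop stays free.
            loop = asyncio.get_running_loop()
            result = await asyncio.wait_for(
                loop.run_in_executor(None, check), timeout=timeout
            )
    except asyncio.TimeoutError:
        return False, f"Healthcheck timed out after {timeout:.1f} seconds"
    except Exception as exc:
        return False, f"Healthcheck raised: {exc}"
    if not result:
        return False, "Healthcheck returned False"
    return True, None
```

A caller would map a False result to the UNHEALTHY status and include the error detail in the /health-check response.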

* feat(coglet): implement user-defined healthcheck support

Add healthcheck support to coglet-rust:

Protocol:
- Add ControlRequest::Healthcheck and ControlResponse::HealthcheckResult
- Add HealthcheckStatus enum (Healthy/Unhealthy)

Orchestrator:
- Add HealthcheckResult type with healthy()/unhealthy() constructors
- Add healthcheck() method to Orchestrator trait
- Implement request/response flow via control channel
- Add semaphore to prevent concurrent healthchecks (skip if busy)
- Handle healthcheck results in event loop

HTTP:
- Add HealthResponse enum (includes transient UNHEALTHY state)
- Update /health-check to call user healthcheck when ready
- Return user_healthcheck_error in response on failure

Worker:
- Add healthcheck() to PredictHandler trait (default: healthy)
- Handle Healthcheck requests in worker event loop

Python integration (coglet-python):
- Add has_healthcheck() and is_healthcheck_async() to PythonPredictor
- Implement healthcheck_sync() with ThreadPoolExecutor + 5s timeout
- Implement healthcheck_async() with asyncio.wait_for + 5s timeout
- Wire up in PythonPredictHandler::healthcheck()

* test: add async healthcheck integration tests and enable coglet_rust

- Remove [coglet_rust] skip from existing sync healthcheck tests
- Add async healthcheck tests:
  - healthcheck_async_custom: async healthcheck returning True
  - healthcheck_async_unhealthy: async healthcheck returning False
  - healthcheck_async_exception: async healthcheck raising exception
  - healthcheck_async_timeout: async healthcheck timing out (>5s)

* fix: resolve pyright type errors and lint issues

Python type fixes:
- _adt.py: Fix type hints for PrimitiveType methods to handle Any
- config.py: Add type arguments to dict types
- input.py: Add cast for default_factory, add type ignore for field()
- coder.py: Rename factory parameter from cls to tpe (static method)
- coders/*.py: Match renamed parameter in factory method overrides
- http.py: Add type ignores for dynamic FastAPI types and coglet module
- _inspector.py: Remove unused imports, add 'from None' to re-raises

Makefile:
- Update tox env from typecheck-pydantic2 to typecheck (pydantic removed)

Cleanup:
- Remove unused warnings import from _inspector.py
- Remove experimental coders warning

* fix(coglet): correct healthcheck timeout message format and harness

- Change timeout format from {} to {:.1} to output '5.0' instead of '5'
- Update test harness waitForServer to accept UNHEALTHY and BUSY as valid 'ready' states

* docs: remove all Pydantic references

- Remove Pydantic compat code from cog.Path
- Update README, docs/python.md, docs/llms.txt
- Clean up comments referencing pydantic

* chore: remove pydantic dependency and cog-dataclass scaffold

- Remove pydantic from dependencies in pyproject.toml
- Simplify dependencies to minimal set
- Remove PYDANTIC_V2 constant from pyright config
- Delete cog-dataclass/ directory (was scaffold, code now in python/cog/)

* fix: CI failures - lint, Go tests, and CodeQL warning

- Remove unused Type import from types.py
- Remove pydantic from Go dockerfile test expectation
- Remove pydantic comment from requirements_test.go
- Fix pyright warnings in openapi_schema.py (use Any type)
- Sanitize validation error messages to first line only

* test: fix healthcheck timeout tests to use trigger-based approach

Use prediction to trigger slow healthcheck mode instead of relying on call counting, which was flaky due to harness also calling healthcheck.

* fix: remove unused imports in openapi_schema.py

* fix: properly timeout sync healthchecks in Python server

Use ThreadPoolExecutor with shutdown(wait=False) to avoid blocking when sync healthcheck exceeds timeout. Previously the context manager would wait for the thread to complete even after timeout.
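
The sync-timeout fix above can be illustrated with the following sketch. It assumes a hypothetical helper named healthcheck_sync with a (healthy, error) return value; only the use of ThreadPoolExecutor and shutdown(wait=False) is taken from the commit message.

```python
import concurrent.futures
from typing import Callable, Optional, Tuple


def healthcheck_sync(
    check: Callable[[], bool], timeout: float = 5.0
) -> Tuple[bool, Optional[str]]:
    """Run a sync healthcheck with a timeout (hypothetical helper)."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(check)
        try:
            result = future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            return False, f"Healthcheck timed out after {timeout:.1f} seconds"
        except Exception as exc:
            return False, f"Healthcheck failed: {exc}"
        if not result:
            return False, "Healthcheck returned False"
        return True, None
    finally:
        # wait=False returns immediately. Using `with ThreadPoolExecutor()`
        # would call shutdown(wait=True) on exit and block until the slow
        # check finishes, which defeats the timeout.
        executor.shutdown(wait=False)
```

The worker thread is not killed on timeout; it keeps running in the background until the check returns, so this approach only stops the caller from waiting on it.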

* fix: sanitize validation error messages to prevent info leakage

Add _sanitize_validation_message() that only passes through known safe validation patterns (Field required, Invalid value, fails constraint, does not match regex/choices). Unknown messages are replaced with generic 'Invalid value' to prevent potential stack trace or internal details from reaching clients.

This addresses CodeQL security warning about information exposure.
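
An allowlist sanitizer of the kind described above might be sketched as follows. The function name sanitize_validation_message and the exact regular expressions are assumptions; only the list of safe phrases, the first-line rule, and the 'Invalid value' fallback come from the commit message.

```python
import re

# Allowlist modelled on the phrases named in the commit message.
# The exact regexes are assumptions made for this sketch.
_SAFE_PATTERNS = [
    re.compile(r"^Field required$"),
    re.compile(r"^Invalid value"),
    re.compile(r"fails constraint"),
    re.compile(r"does not match (regex|choices)"),
]


def sanitize_validation_message(message: str) -> str:
    """Return the first line if it matches a known-safe pattern, else a generic message."""
    lines = message.strip().splitlines()
    first_line = lines[0] if lines else ""
    if any(pattern.search(first_line) for pattern in _SAFE_PATTERNS):
        return first_line
    # Unknown text may contain stack traces or internal details, so drop it.
    return "Invalid value"
```

For example, a message that begins with "Traceback (most recent call last):" is replaced by "Invalid value", while "Field required" passes through unchanged.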