Filter .cog/ from Docker build context, centralize build state (#3000)
* feat: filter .cog/ from Docker build context, stabilize build cache dir
Switch from LocalDirs to LocalMounts with fsutil filtering so .cog/ is
never sent to the Docker daemon as build context. This prevents weight
blobs, mount dirs, and stale tmp dirs from inflating context size.
- Add ExcludePatterns and BuildCacheDir to ImageBuildOptions
- Introduce cog_build named build context for staging artifacts
- Generated Dockerfiles use COPY --from=cog_build instead of
project-relative paths into .cog/tmp/
- Replace timestamp-based .cog/tmp/build<ts>/ with stable .cog/build/
so COPY paths are deterministic across builds (fixes layer cache busting)
- Move bundled schema and weights manifest into .cog/build/
- Remove checkCompatibleDockerIgnore (no longer needed -- .cog is
excluded at the session level, so users can freely ignore it)
- Write Dockerfile to BuildCacheDir when provided, eliminating a
separate temp dir
* refactor: introduce dotcog.Dir to own .cog/ lifecycle
Replace scattered filepath.Join + global.CogBuildArtifactsFolder patterns
with a structured dotcog.Dir that manages path resolution, directory
creation, advisory locking, and cleanup for the .cog/ project directory.
- Add pkg/dotcog/ with Open, OpenTemp, Path, FilePath, Lock, Close
- Wire dotcog.Dir through Source, Build, weights import, and debug
- Add Source.Close() and defer src.Close() in all CLI commands
- Lock API uses idiomatic Lock(ctx) (release, err) + defer release()
- Remove pkg/lockfile/, pkg/dockerignore/, dockercontext/build_tempdir
- Remove global.CogBuildArtifactsFolder (replaced by dotcog.Name)
- Replace hardcoded .cog/ path constants with buildPaths struct
- Add .cog to cog init .dockerignore template
- Simplify example .dockerignore files (exclude .cog/ entirely)
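The Lock(ctx) (release, err) + defer release() shape mentioned above can be
sketched as follows. This is an illustration only: dirLock, newDirLock, and
example are hypothetical names, and a buffered channel stands in for the
on-disk advisory lock, so the sketch shows the call-site idiom and the
cancellation behaviour, not how dotcog.Dir actually takes the lock.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// dirLock is a hypothetical in-process stand-in for the advisory lock.
// A buffered channel of capacity 1 acts as the mutex.
type dirLock struct{ slot chan struct{} }

func newDirLock() *dirLock { return &dirLock{slot: make(chan struct{}, 1)} }

// Lock blocks until the lock is acquired or ctx is done. On success it
// returns a release func that is safe to call more than once.
func (l *dirLock) Lock(ctx context.Context) (release func(), err error) {
	select {
	case l.slot <- struct{}{}:
		var once sync.Once
		return func() { once.Do(func() { <-l.slot }) }, nil
	case <-ctx.Done():
		return nil, fmt.Errorf("acquiring lock: %w", ctx.Err())
	}
}

// example holds the lock, then shows that a second attempt under an
// already-cancelled context fails instead of blocking forever.
func example() (held bool, second error) {
	l := newDirLock()
	release, err := l.Lock(context.Background())
	if err != nil {
		return false, err
	}
	defer release()

	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, second = l.Lock(ctx)
	return true, second
}
```

Returning the release func from Lock keeps acquisition and release next to
each other at the call site, and the caller cannot release a lock it never
acquired.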
* cleanup: tighten dotcog.Dir lifecycle and remove dead code
- Close() uses sync.Once for idempotent double-call safety
- OpenTemp registers temp dir removal via onClose instead of a temp flag
- Add TempPath() for scratch dirs (build/) that clean up on Close
- Lock() returns explicit error when !locked && ctx.Err() == nil
- Remove generator.Cleanup() -- dotcog owns .cog/build/ lifecycle
- Remove Cleanup() from Generator interface
- Unexport OnClose (no external callers), remove unused Remove()
- Fix stale package doc (WithLock -> Lock)
- Remove accidentally committed plans/ directory
* cleanup: wipe build dir on start, remove stale comments, fix formatting
- TempPath wipes existing contents before creating the directory,
so stale artifacts from crashed builds don't leak into new ones
- Remove surgical os.Remove calls for individual build artifacts
(TempPath's wipe handles it)
- Remove stale Cleanup() reference in field comment
- Remove editorial comment on Generator interface
- Fix struct literal alignment (gofmt)
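A minimal sketch of the wipe-on-create behaviour described for TempPath,
assuming it amounts to remove-then-recreate. tempPath and staleSurvivesWipe
are hypothetical names, and the sketch leaves out the cleanup-on-Close
registration that the real TempPath performs.

```go
package main

import (
	"os"
	"path/filepath"
)

// tempPath is a hypothetical sketch: it removes anything already at
// root/name and recreates it as an empty directory.
func tempPath(root, name string) (string, error) {
	p := filepath.Join(root, name)
	if err := os.RemoveAll(p); err != nil { // a missing path is not an error
		return "", err
	}
	if err := os.MkdirAll(p, 0o755); err != nil {
		return "", err
	}
	return p, nil
}

// staleSurvivesWipe simulates a crashed build that left a file behind and
// reports whether that file is still present after the next tempPath call.
func staleSurvivesWipe() (bool, error) {
	root, err := os.MkdirTemp("", "dotcog-demo-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	build, err := tempPath(root, "build")
	if err != nil {
		return false, err
	}
	stale := filepath.Join(build, "stale.whl")
	if err := os.WriteFile(stale, []byte("old"), 0o644); err != nil {
		return false, err
	}

	if _, err := tempPath(root, "build"); err != nil {
		return false, err
	}
	_, statErr := os.Stat(stale)
	return statErr == nil, nil
}
```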
* cleanup: remove dead dockercontext package and generator factory
- Delete pkg/dockercontext/ (only export was StandardBuildDirectory=".")
- Inline "." in baseimage.go
- Delete generator_factory.go and its test
- Inline NewStandardGenerator at 3 call sites (build.go, debug.go, base.go)
* cleanup: remove redundant .cog from .dockerignore files
.cog/ is excluded at the fsutil session level via ExcludePatterns,
so .dockerignore entries are redundant for cog build.
* refactor: replace .dockerignore mutation with ExcludePatterns for separate weights
Separate-weights builds no longer back up, write, and restore .dockerignore on
disk. Weight file exclusions for the runner build are passed as fsutil
ExcludePatterns instead, matching how .cog/ is already excluded.
- Remove backupDockerignore, writeDockerignore, restoreDockerignore
- Remove makeDockerignoreForWeightsImage (weights Dockerfile uses
explicit COPY paths, .dockerignore was irrelevant)
- Remove DockerignoreHeader constant and makeDockerignoreForWeights
- GenerateModelBaseWithSeparateWeights returns []string exclude patterns
instead of a dockerignore string
- buildRunnerImage takes excludePatterns parameter
* docs: update architecture doc for .cog/build/ and cog_build context
.cog/tmp/ no longer exists; wheels are staged in .cog/build/ and
referenced via the cog_build named build context.
* cleanup: unify build helpers, use dotcog.Name consistently
- Merge buildWeightsImage + buildRunnerImage into buildContextImage
(identical except for excludePatterns parameter)
- Use dotcog.Name constant instead of hardcoded ".cog" in
bundleDockerfile and test files
- Document BuildCacheDir / BuildContexts["cog_build"] relationship
- Clarify Source.DotCog nil comment
* fix: write schema/weights to .cog/ root for volume-mount visibility
When cog predict/train/serve volume-mount the project dir at /src,
the host's .cog/ shadows the image's .cog/ layer. Coglet looks for
.cog/openapi_schema.json on disk, so it needs to exist on the host
filesystem -- not just in the image.
Write schema and weights manifest to both .cog/build/ (for the
cog_build named context used by COPY --from=cog_build) and .cog/
root (for volume-mount visibility).
* address review: add dotcog tests, nil guard Source.Close, small cleanups
- Add 15 unit tests for dotcog (Open, OpenTemp, Path, TempPath,
FilePath, Close nil-safety/idempotency/LIFO/error joining, Lock
acquire/release/blocking/cancellation)
- Nil guard in Source.Close for NewSourceFromConfig path
- Wrap filepath.Abs error in dotcog.Open
- Use slices.Backward in Close, drop redundant nil check before
errors.Join
- Remove stale pkg/dockercontext/ from architecture/06-cli.md
* use .cog/ trailing slash in exclude pattern
* refactor: atomic writes via files.AtomicWrite, trailing slash on exclude pattern
- Move AtomicWrite and Copy into pkg/util/files (out of build.go)
- Write schema/weights atomically to .cog/ root, then copy to
.cog/build/ for the named build context -- one generation, no
partial files visible to concurrent readers
- Use .cog/ trailing slash in exclude pattern
- AtomicWrite does fsync before rename (modeled after tailscale/atomicfile)
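The fsync-before-rename pattern can be sketched as below. atomicWrite and
writeTwice are hypothetical names; the signature of files.AtomicWrite may
differ. The sketch relies on rename being atomic within one directory, which
is why the temp file is created next to the target.

```go
package main

import (
	"os"
	"path/filepath"
)

// atomicWrite is a hypothetical sketch: write to a temp file in the target's
// directory, fsync it, then rename it over the target so readers see either
// the old contents or the new contents, never a partial file.
func atomicWrite(path string, data []byte, perm os.FileMode) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer func() {
		if err != nil { // on any failure, do not leave the temp file behind
			tmp.Close()
			os.Remove(tmpName)
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		return err
	}
	if err = tmp.Chmod(perm); err != nil {
		return err
	}
	if err = tmp.Sync(); err != nil { // flush data to disk before the rename
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// writeTwice overwrites the same file twice in a scratch directory and
// returns the final contents.
func writeTwice() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "openapi_schema.json")
	for _, body := range []string{`{"v":1}`, `{"v":2}`} {
		if err := atomicWrite(target, []byte(body), 0o644); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(target)
	return string(got), err
}
```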
---------
Co-authored-by: Mark Phelps <209477+markphelps@users.noreply.github.com>