chore(coglet): Rust safety improvements and wheel platform fix (#2717)
* chore(coglet): pass transport info from Init
* chore(coglet): add Fatal IPC message and panic hook for worker shutdown
Any panic in the worker (including one raised on a poisoned mutex) now
triggers the following sequence:
- panic hook sends ControlResponse::Fatal { reason } to the parent
- parent poisons all slots and fails in-flight predictions
- worker aborts (no unwinding through FFI/Python)
.expect() at lock sites is the correct idiom: the panic hook
handles IPC notification and hard abort automatically.
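A minimal sketch of the hook pattern described above, not the actual coglet
code. It assumes a std::sync::mpsc channel as a stand-in for the real IPC
transport; the ControlResponse enum here, panic_reason() and
install_fatal_hook() are hypothetical names used only for illustration.

```rust
use std::any::Any;
use std::panic;
use std::process;
use std::sync::mpsc::Sender;
use std::sync::Mutex;

/// Hypothetical stand-in for the real IPC message type.
#[derive(Debug, PartialEq)]
pub enum ControlResponse {
    Fatal { reason: String },
}

/// Extract a human-readable reason from a panic payload.
/// panic!("literal") carries a &str; panic!("{}", x) carries a String.
pub fn panic_reason(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic".to_string()
    }
}

/// Install a process-wide hook that notifies the parent, then aborts.
/// Assumption: `tx` models the channel back to the parent process.
pub fn install_fatal_hook(tx: Sender<ControlResponse>) {
    // The Mutex makes the captured sender Sync, which set_hook requires.
    let tx = Mutex::new(tx);
    panic::set_hook(Box::new(move |info| {
        let reason = panic_reason(info.payload());
        // Best effort: ignore failures, the process is about to abort.
        if let Ok(tx) = tx.lock() {
            let _ = tx.send(ControlResponse::Fatal { reason });
        }
        // Hard abort so the panic never unwinds through FFI/Python.
        process::abort();
    }));
}
```

With such a hook installed, a failed `lock().expect("...")` needs no local
recovery code: the hook reports the reason and ends the process.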
* chore(coglet): move slot poisoning to pool level
Slot poisoning is a property of the slot, not the prediction. A slot
can be poisoned regardless of whether a prediction is active on it.
- Add PermitPool::poison(slot_id) with per-slot AtomicBool flags
- try_acquire/acquire skip poisoned permits (lazy filtering)
- PermitIdle::drop checks poison flag, skips returning to pool
- Orchestrator poisons slots directly on Failed/Fatal via pool
- Remove slot_poisoned from Prediction (wrong abstraction)
- Remove PredictionSlot::into_poisoned (service always uses into_idle)
- Service send-failure path poisons at pool level before into_idle
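The pool-level poisoning above can be sketched roughly as follows. This is a
simplified, synchronous illustration under stated assumptions: the real pool
is likely async and its fields are not shown in this commit message, so the
Inner struct, the VecDeque of idle slot ids and slot_id() are hypothetical.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

struct Inner {
    idle: Mutex<VecDeque<usize>>, // slot ids currently available
    poisoned: Vec<AtomicBool>,    // one flag per slot
}

#[derive(Clone)]
pub struct PermitPool {
    inner: Arc<Inner>,
}

/// A permit for one slot; returns itself to the pool on drop unless poisoned.
pub struct PermitIdle {
    slot_id: usize,
    inner: Arc<Inner>,
}

impl PermitPool {
    pub fn new(slots: usize) -> Self {
        let inner = Inner {
            idle: Mutex::new((0..slots).collect()),
            poisoned: (0..slots).map(|_| AtomicBool::new(false)).collect(),
        };
        PermitPool { inner: Arc::new(inner) }
    }

    /// Mark a slot as poisoned, whether or not a prediction is active on it.
    pub fn poison(&self, slot_id: usize) {
        if let Some(flag) = self.inner.poisoned.get(slot_id) {
            flag.store(true, Ordering::Release);
        }
    }

    /// Lazy filtering: poisoned permits are discarded when encountered.
    pub fn try_acquire(&self) -> Option<PermitIdle> {
        let mut idle = self.inner.idle.lock().expect("permit pool lock poisoned");
        while let Some(slot_id) = idle.pop_front() {
            if !self.inner.poisoned[slot_id].load(Ordering::Acquire) {
                return Some(PermitIdle { slot_id, inner: Arc::clone(&self.inner) });
            }
        }
        None
    }
}

impl PermitIdle {
    pub fn slot_id(&self) -> usize {
        self.slot_id
    }
}

impl Drop for PermitIdle {
    fn drop(&mut self) {
        // A poisoned slot is never handed out again.
        if self.inner.poisoned[self.slot_id].load(Ordering::Acquire) {
            return;
        }
        // Avoid panicking inside drop if the lock itself is poisoned.
        if let Ok(mut idle) = self.inner.idle.lock() {
            idle.push_back(self.slot_id);
        }
    }
}
```

Keeping the flag in the pool means callers can always release a permit the
same way; whether the slot is reusable is decided in one place.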
* chore(coglet): coalesce concurrent healthcheck requests
Replace the semaphore + fake-healthy-when-in-progress pattern with
proper request coalescing. All concurrent callers now wait on the
same in-flight healthcheck and receive the real result.
- Remove healthcheck_semaphore from OrchestratorHandle
- Event loop uses Vec<Sender> instead of Option<Sender>
- Only one ControlRequest::Healthcheck sent to worker at a time
- Result broadcast to all waiters when it arrives
- Timed-out callers drop silently; healthcheck continues for others
- Fatal/crash paths fail all pending healthcheck waiters
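A rough sketch of the coalescing logic, assuming std::sync::mpsc channels in
place of whatever channel type the event loop actually uses.
HealthcheckWaiters, HealthcheckResult, register() and complete() are
hypothetical names for illustration only.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical result type; the real one is not shown in this commit.
#[derive(Clone, Debug, PartialEq)]
pub enum HealthcheckResult {
    Healthy,
    Unhealthy(String),
}

/// Waiters for the single in-flight healthcheck (Vec instead of Option).
#[derive(Default)]
pub struct HealthcheckWaiters {
    pending: Vec<Sender<HealthcheckResult>>,
}

impl HealthcheckWaiters {
    /// Register a caller. The bool is true only for the first waiter, i.e.
    /// when a healthcheck request must actually be sent to the worker.
    pub fn register(&mut self) -> (Receiver<HealthcheckResult>, bool) {
        let (tx, rx) = channel();
        let must_send = self.pending.is_empty();
        self.pending.push(tx);
        (rx, must_send)
    }

    /// Broadcast the result to every waiter and clear the list.
    /// Callers that timed out have dropped their receiver; their send
    /// fails and is ignored. Fatal/crash paths can call this with an
    /// Unhealthy result to fail all pending waiters.
    pub fn complete(&mut self, result: HealthcheckResult) {
        for tx in self.pending.drain(..) {
            let _ = tx.send(result.clone());
        }
    }
}
```

Because the list is only cleared in complete(), a caller that times out does
not cause a second request to be sent while the first is still in flight.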
* fix(cog): select coglet wheel by Docker target platform
The coglet wheel is a native Rust extension (platform-specific), but wheel
selection had no platform awareness: it globbed coglet-*.whl and took the
first match alphabetically. On ARM Mac with both wheels in dist/, the macOS
wheel sorted before the manylinux wheel and was incorrectly picked for a
Linux Docker build.
Additionally, StandardGenerator.GOOS and GOARCH were both initialized from
runtime.GOOS (a copy-paste typo), giving "darwin"/"darwin" on Mac
instead of the Docker build target "linux"/"amd64".
Changes:
- Add wheelPlatformTag() mapping GOARCH to wheel platform substring
- Add filterWheelsByPlatform() to filter wheel paths by platform tag
- Thread platformTag through findWheelInDist/findWheelInDistSilent
- GetCogletWheelConfig now takes targetArch parameter
- Fix StandardGenerator GOOS/GOARCH to match Docker target (linux/amd64)
- Cog SDK wheel (py3-none-any) passes empty platformTag (no filtering)