transformers
0dd2f492 - Remove vendored distributed/ (2D context-parallel) stack from esmfold2

Commit
2 days ago
The distributed/ package is an NVIDIA/MIT-licensed 2D context-parallel implementation of the folding trunk (DTensor + DeviceMesh + NCCL) for multi-GPU 6B inference. It is dropped from the port because:

- It is not imported by any model code (core, config, __init__, or the experimental file), so it is fully inert in the package.
- It is broken on import: all 7 files import from `projects.huggingface.transformers.models.esmfold2...` (the fork's internal monorepo path), so `import transformers.models.esmfold2.distributed` raises ModuleNotFoundError. It never worked in the standalone layout.
- It is NVIDIA/MIT-licensed, unlike the Apache/Biohub model code.
- Transformers expresses parallelism declaratively via `base_model_tp_plan` / `tp_plan="auto"`, not a vendored per-model DTensor/NCCL stack.

Nothing unique is lost: the math it shards already exists as the pure-PyTorch reference in modeling_esmfold2_common.py. If multi-GPU inference is needed later, author a tp_plan on ESMFold2Model fresh.

Verified: nothing references distributed/; `import transformers` and the esmfold2 modeling module still import cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
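
The "nothing references distributed/" check can be reproduced with a static scan of import statements. The following is a minimal sketch, not the script used for this commit: the helper name `find_subpackage_imports` and its parameters are hypothetical, and it only inspects `import` / `from ... import` statements (string-based or `importlib` imports would be missed). It resolves relative imports against the package name so that unrelated modules such as `torch.distributed` are not reported.

```python
import ast
from pathlib import Path


def find_subpackage_imports(package_dir, package_name, subpackage):
    """List (relative_file, lineno) for imports of `package_name.subpackage`.

    Hypothetical helper: files inside the subpackage itself are skipped,
    since the question is whether anything *else* depends on it.
    """
    package_dir = Path(package_dir)
    target = f"{package_name}.{subpackage}"
    hits = []
    for path in sorted(package_dir.rglob("*.py")):
        rel = path.relative_to(package_dir)
        if rel.parts[0] == subpackage:
            continue
        # Dotted package that contains this file, e.g. pkg or pkg.sub
        pkg_parts = package_name.split(".") + list(rel.parent.parts)
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                candidates = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                if node.level:
                    # Resolve `from .x import y` / `from .. import z` to an absolute name
                    base = pkg_parts[: len(pkg_parts) - (node.level - 1)]
                    if node.module:
                        base = base + [node.module]
                    resolved = ".".join(base)
                else:
                    resolved = node.module or ""
                # `from pkg import distributed` imports the submodule by alias name
                candidates = [resolved] + [f"{resolved}.{a.name}" for a in node.names]
            else:
                continue
            if any(c == target or c.startswith(target + ".") for c in candidates):
                hits.append((rel.as_posix(), node.lineno))
    return hits
```

Assuming the package directory is src/transformers/models/esmfold2, a call such as `find_subpackage_imports("src/transformers/models/esmfold2", "transformers.models.esmfold2", "distributed")` returning an empty list would be consistent with the package being inert.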