Wait for main process in _save_checkpoint to ensure best checkpoint exists #40923
ssharpe42 force pushed from b226cb9b to f140bf80 117 days ago
Update trainer.py
0d334704
fix
b5dbbbe6
fix format
7ec0bbe5
move barrier, delete redundant
ba9bffc1
ssharpe42 force pushed from ca1a80d0 to ba9bffc1 115 days ago
SunMarc approved these changes on 2025-09-19
Merge branch 'main' into fix-best-ckpt-wait
3a1faa21
SunMarc enabled auto-merge (squash) 114 days ago
Merge branch 'main' into fix-best-ckpt-wait
04a26a59
SunMarc merged d9739778 into main 104 days ago