pytorch-lightning
Move checkpoint for auto-requeue on SLURM clusters out of signal handlers
#21407
Open

martenlienen wants to merge 2 commits into Lightning-AI:master from martenlienen:slurm-signals
martenlienen requested a review from lantiga 41 days ago
martenlienen requested a review from tchaton 41 days ago
martenlienen requested a review from justusschock 41 days ago
martenlienen requested a review from ethanwharris 41 days ago
github-actions added pl
martenlienen force pushed from 6a100f7b to 7f19428f 41 days ago
martenlienen force pushed from 7f19428f to ac50dbd0 41 days ago
martenlienen Move checkpointing out of signal handlers
d94f46cd
martenlienen Remove unused type-ignore comment
f0c2c47c
martenlienen force pushed from ac50dbd0 to f0c2c47c 41 days ago

Assignees
No one assigned