Megatron-DeepSpeed
deploy elastic error handler
#258
Merged

stas00 merged 1 commit into main from torch-dist-record
stas00 deploy elastic error handler 1b0894bb
stas00 merged 8673d464 into main 4 years ago
stas00 deleted the torch-dist-record branch 4 years ago
