Add exception check in training_runner when worker runs into error, and misc check on nccl and mpi calls #4380
wschin
dismissed these changes
on 2020-07-21
error check
13648648
fix build warning treated as error
a524fec0
xzhu1900
dismissed their stale review
via a524fec0
5 years ago
xzhu1900
force pushed from 4c625fcf to a524fec0
5 years ago
wschin
approved these changes
on 2020-07-22
xzhu1900
merged
e2acb165
into master 5 years ago
xzhu1900
deleted the xuzhu/check branch 5 years ago