Megatron-DeepSpeed
Group tensorboard metrics
#39
Merged

VictorSanh merged 12 commits into main from tensorboard
VictorSanh Training groupings
bbafd50a
VictorSanh validation grouping
c68cad42
VictorSanh steps vs samples
0447b630
VictorSanh iteration time (speed -> samples or iterations per second)
f14e7728
VictorSanh tensorboard group time (from `log_timers_to_tensorboard`)
cfb02c7a
VictorSanh comment on the writing condition
97101125
stas00 approved these changes on 2021-08-04
VictorSanh Update megatron/global_vars.py
f828e79c
VictorSanh Update megatron/training.py
dda74585
VictorSanh Update megatron/training.py
73b7c61d
VictorSanh Update megatron/training.py
c56c704a
VictorSanh Update megatron/training.py
1dcf28ad
VictorSanh link bug fix issue on megatron-lm side
7a43335c
VictorSanh merged 9e75429d into main 4 years ago
