Megatron-DeepSpeed
Floating-point ops counting and reloading
#40
Merged

TevenLeScao merged 24 commits into main from training_flos
TevenLeScao initial flo count/logging setup (need to fix model parameter count)
5196f8a5
TevenLeScao initial flo count/logging setup (need to fix model parameter count)
a3a4ba46
TevenLeScao 1B3 parameter setup + flos counting
bdd75f18
TevenLeScao 1B3 parameter setup + flos counting
c0fc29a6
TevenLeScao 1B3 parameter setup + flos counting
aefbe3bd
TevenLeScao 1B3 parameter setup
17e01842
TevenLeScao 1B3 parameter setup
97dd06db
TevenLeScao synched with latest 13B script
64892e27
TevenLeScao synched with latest 13B script
3c79aaca
pipe transformer docstring
b7b3167f
improve DS integration evaluation + logging
8382141c
jeffra use pp engine even for pp=1 (#6)
06cb18fa
removed slurm_examples
d5818947
TevenLeScao flos re-loading
60794bf7
TevenLeScao requested a review from ibeltagy 4 years ago
TevenLeScao requested a review from stas00 4 years ago
TevenLeScao requested a review from thomasw21 4 years ago
TevenLeScao commented on 2021-08-04
TevenLeScao commented on 2021-08-04
TevenLeScao commented on 2021-08-04
TevenLeScao commented on 2021-08-04
stas00 approved these changes on 2021-08-04
thomasw21 commented on 2021-08-04
TevenLeScao Merge branch 'main' into training_flos
c79db1cd
TevenLeScao Update megatron/training.py
fb33f138
TevenLeScao Update megatron/data/gpt_dataset.py
dff1479b
stas00 commented on 2021-08-24
TevenLeScao Update megatron/utils.py
2fa3b5b8
TevenLeScao Update megatron/utils.py
ff7af108
TevenLeScao formatting fix, reserving bug for somewhere else, adding flo-logging …
b9ac381f
TevenLeScao indentation bug
f25e25f5
TevenLeScao fixing possible double counts
e63503dd
stas00 commented on 2021-08-25
stas00 commented on 2021-08-25
TevenLeScao tweaks
5bdcf819
TevenLeScao warning for double counts
72ad7113
TevenLeScao merged af8229e2 into main 4 years ago
