Megatron-DeepSpeed
Support skip iteration flag
#177
Merged
stas00 merged 42 commits into main from skip-iterations
feature: support skip iteration flag
cde7a559
fix: robust input check for skip ranges
5a8015c1
feature: fast forward megatron train loop
453b806a
test: add basic test for skip iteration
4a1ef27f
jaketae marked this pull request as ready for review 4 years ago
stas00
commented on 2021-11-04
stas00
commented on 2021-11-04
stas00
commented on 2021-11-04
Update megatron/training.py
7ae82e2f
fix: merge overlapping intervals
e473f4d7
fix: flush irrelevant intervals, fix boundary condition
460e6eb0
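The commits up to this point mention input checks for skip ranges, merging overlapping intervals, and a boundary-condition fix. The following is a minimal sketch of that idea, not the code from this PR: the "START-END" string format, the inclusive bounds, and the names parse_skip_ranges, merge_ranges, and should_skip are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (start, end); treated as inclusive here (assumption)


def merge_ranges(ranges: Iterable[Range]) -> List[Range]:
    """Sort ranges by start and merge any that overlap."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlap: extend the previous range, keeping the larger end.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def parse_skip_ranges(specs: Iterable[str]) -> List[Range]:
    """Parse hypothetical 'START-END' strings into merged, validated ranges."""
    ranges: List[Range] = []
    for spec in specs:
        try:
            start, end = (int(part) for part in spec.split("-"))
        except ValueError:
            raise ValueError(f"expected START-END, got {spec!r}") from None
        if start > end:
            raise ValueError(f"start must not exceed end, got {spec!r}")
        ranges.append((start, end))
    return merge_ranges(ranges)


def should_skip(iteration: int, ranges: Iterable[Range]) -> bool:
    """Return True if the iteration falls inside any skip range."""
    return any(start <= iteration <= end for start, end in ranges)
```

A training loop could call should_skip(iteration, ranges) at the top of each step and advance its counters without running the forward and backward pass when it returns True. How the actual PR fast-forwards the loop is not shown on this page.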
stas00
commented on 2021-11-05
Update megatron/training.py
0f01330e
feature: log on rank 0
0ff51ea3
fix: use f-string
356bb579
fix: iteration is incremented first, then logged
4ac84b4e
test: add checks using stdout
a92b7ab2
stas00
commented on 2021-11-05
stas00
commented on 2021-11-06
jaketae
commented on 2021-11-06
jaketae
commented on 2021-11-06
Update tests/test_training.py
d29e6892
Update tests/test_training.py
c657084b
refactor: use loop to simplify asserts
8947cc6c
fix: end will be larger than last end
27329847
test: add checks on consumed tokens
f589eafc
test: use parametrized variations
a1164c3c
test: simplify skip iter test to base, cl
d9aaa0b1
Merge remote-tracking branch 'origin/main' into skip-iterations
48dbe64d
Trigger CI
2eb2a66b
2x instances
87116b34
2x instances
15507835
test: hard code num_gpus to 2
0b56230f
test: change test name to zskip
10251c38
test: revert back to `get_gpu_count()`
205d8685
test: run only skip test
7900da52
test: remove skip iter test
0bbe4044
fix: account for other ranks
fc811084
rework the test to do the right thing for cl
4b7de290
Merge remote-tracking branch 'origin/main' into skip-iterations
ded71f4e
undo debug
4fee00d7
wip
e1c23d0c
success
9a649fa9
jaketae
commented on 2021-11-11
Update megatron/arguments.py
d1caab31
conglongli
commented on 2021-11-11
stas00
commented on 2021-11-14
fix: update flag name
4abcd8c4
chore: backport commit 7a0158e
1a70624c
chore: simplify test
989e2c6c
Trigger CI
f1e9283a
Trigger CI
bb29ae97
small tweaks
87ef7991
Trigger CI
4e0581a0
stas00 merged 106a9a6f into main 4 years ago
stas00 deleted the skip-iterations branch 4 years ago
Reviewers: stas00, conglongli
Assignees: No one assigned
Labels: None yet
Milestone: No milestone