bigscience-workshop/Megatron-DeepSpeed

refactor: use set for constant time lookup

jaketae committed 4 years ago

ac3e138b

refactor: replace filter w/ list comp, generator to list

jaketae committed 4 years ago

a7b10b7c

fix: use deepspeed param count method

jaketae committed 4 years ago

f4c7c67e

Update megatron/training.py

stas00 committed 4 years ago

Verified c2d63903

Update megatron/training.py

jaketae committed 4 years ago

Verified 816b8670

refactor: compute model param count once

Jake Tae committed 4 years ago

544108dc

param size printing revamp (#202)

stas00 committed 4 years ago

Verified fd1e1da9

[WIP] dealing with multi-process noise (#193)

stas00 committed 4 years ago

Verified 5d3150c7

elastic launcher compatible init_process_group (#201)

stas00 committed 4 years ago

Verified df5a5c4e

add missing space (#200)

SaulLu committed 4 years ago

Verified 3487f315

[CI] fix which tests get run (#199)

stas00 committed 4 years ago

Verified e9558293

Delete unnecessary brackets (#197)

SaulLu committed 4 years ago

Verified aa7fa931

Add eval-only arg (#188)

SaulLu committed 4 years ago

Verified 05851eb5

removed regular package for megatron model (#192)

stas00 committed 4 years ago

Verified 4989bcde

add layernorm in Embedding (#191)

stas00 committed 4 years ago

Verified b1359ee0

Support skip iteration flag (#177)

jaketae committed 4 years ago

Verified 106a9a6f

Full seqlen eval for CL+PP (#187)

conglongli committed 4 years ago

Verified 2105ff38

[BNB] integrate `StableEmbeding` into `VocabParallelEmbedding` logic (#182)

stas00 committed 4 years ago

Verified a34ca7f2

[PrefixLM] Figuring out why prefix lm is doing poorly on short context (#169)

thomasw21 committed 4 years ago

Verified 6d146b5f

[CI] improvements (#185)

stas00 committed 4 years ago

Verified 590f3e27

Alternative fix to TP > 1 (#178)

thomasw21 committed 4 years ago

Verified 2d9744f2

Fixed TP > 1 issue with new validation scheme

TevenLeScao committed 4 years ago

b982e040

simplifying tests

TevenLeScao committed 4 years ago

b1fc4927

Fixed merge oversight in tensorboard logs

TevenLeScao committed 4 years ago

18b704e1

Adding language specific validation sets for Multilingual model training (#97)

hadyelsahar committed 4 years ago

Verified 846c0879

Fix prefix lm offsets (#167)

thomasw21 committed 4 years ago

Verified 5e1f2101

Update main.yml (#172)

stas00 committed 4 years ago

Verified 5df29b5d

[CI] fix ci / update packages (#170)

stas00 committed 4 years ago

Verified b39dd1cf

[checkpoint] only one latest file (#164)

stas00 committed 4 years ago

Verified 45dcd528

Fix curriculum learning doc (#162)

conglongli committed 4 years ago

Verified 07533d03
