Fix deepspeed prefix-lm #107
Fix pretraining of the prefix-LM when using DeepSpeed
a4e131cf
Fix: self._args to args
5e8299cb
First set attn_mask in model and then build model
b0c5f10d
Fix: enforce that we pass down tuple instead of generator
f396c696
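The commit above replaces a generator with a tuple before passing it down. The likely bug class (a general Python behavior; this is a hypothetical illustration, not the PR's actual code) is that a generator is exhausted after a single pass, so any downstream code that iterates it twice silently sees nothing the second time:

```python
# Hypothetical illustration of generator exhaustion, not the PR's code.
def make_layers():
    # A generator expression: lazily yields values, can be consumed only once.
    return (f"layer_{i}" for i in range(3))

gen = make_layers()
first_pass = list(gen)   # consumes the generator
second_pass = list(gen)  # generator is now empty

assert first_pass == ["layer_0", "layer_1", "layer_2"]
assert second_pass == []  # silent data loss on re-iteration

# Materializing as a tuple makes repeated iteration safe:
layers = tuple(f"layer_{i}" for i in range(3))
assert list(layers) == ["layer_0", "layer_1", "layer_2"]
assert list(layers) == ["layer_0", "layer_1", "layer_2"]
```

Enforcing a tuple at the boundary guarantees every consumer sees the same, fully materialized sequence.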
thomasw21
changed the title from "Fix deepspeed prefix" to "Fix deepspeed prefix-lm" 4 years ago
Attention mask does not need to be transposed
efcd4972
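For context on the commit above: a prefix-LM mask is not symmetric, but if it is built directly in (query, key) order there is nothing to transpose afterwards. Below is a minimal sketch of such a mask (hypothetical helper name and layout; the PR's actual mask construction and dtype may differ), where prefix positions attend bidirectionally within the prefix and later positions attend to the full prefix plus causally to earlier targets:

```python
def prefix_lm_mask(seq_len: int, prefix_len: int) -> list[list[bool]]:
    """Hypothetical sketch of a prefix-LM attention mask.

    mask[i][j] is True when query position i may attend to key position j:
    every position sees the whole prefix (j < prefix_len), and positions
    after the prefix additionally see earlier positions causally (j <= i).
    """
    return [
        [j < prefix_len or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

# For seq_len=4, prefix_len=2 the mask is:
#   [True, True, False, False]   <- prefix token, sees full prefix only
#   [True, True, False, False]
#   [True, True, True,  False]   <- target token, prefix + causal
#   [True, True, True,  True]
```

Because rows are already indexed by query position and columns by key position, the mask can be handed to the attention layer as-is.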
thomasw21
changed the title from "Fix deepspeed prefix-lm" to "WIP: Fix deepspeed prefix-lm" 4 years ago
thomasw21
marked this pull request as draft 4 years ago
BIGGEST HACK EVER
43f56beb
Remove BIGGEST HACK
62fe4477
Skip prefix test as PP>1 doesn't work yet on deepspeed
e65852f9
thomasw21
changed the title from "WIP: Fix deepspeed prefix-lm" to "Fix deepspeed prefix-lm" 4 years ago
thomasw21
marked this pull request as ready for review 4 years ago
stas00
approved these changes on 2021-10-06
Unskip prefix test
34df6f6f
Merge branch 'main' into thomas/fix_deepspeed_prefix
2ae4ba61
thomasw21
changed the base branch from main to master 4 years ago
thomasw21
changed the base branch from master to main 4 years ago
Merge remote-tracking branch 'origin/main' into thomas/fix_deepspeed_…
9182bee6
thomasw21
merged da31db64 into main 4 years ago
thomasw21
deleted the thomas/fix_deepspeed_prefix branch 3 years ago