Megatron-DeepSpeed
Faster preprocessing
#18
Merged
thomasw21
merged 15 commits into
bigscience-workshop:main
from
thomasw21:faster_preprocessing
thomasw21
force pushed
from
f67b36ea
to
0e8287c0
4 years ago
thomasw21
commented on 2021-07-26
thomasw21
requested a review
from
stas00
4 years ago
thomasw21
commented on 2021-07-26
thomasw21
force pushed
from
b5c74a72
to
e8152b19
4 years ago
stas00
commented on 2021-07-26
Propose a faster preprocessing mechanim by reducing the interprocesse…
fac6e903
thomasw21
force pushed
from
e8152b19
to
fac6e903
4 years ago
Add flush in order to force print
42aeef38
Try to prevent dead locks
25c9090c
thomasw21
force pushed
from
0554e0b3
to
25c9090c
4 years ago
Woops
ce80823d
Trying to figure out what causes deadlock
80ef737e
Limit queue size to 1_000_000
a0f0b9a4
Drastically reduce the maximum number of element in the queue
bdacde26
Threading does not use a worker
f05ba9f1
Remove shard files and factorise shard naming
fed86e98
stas00
commented on 2021-08-02
Document high number of worker preprocessing script
d9736bde
stas00
commented on 2021-08-03
stas00
requested a review
from
stas00
4 years ago
stas00
approved these changes on 2021-08-03
Improve naming
7d53441a
Update comments and readmes
229159de
stas00
commented on 2021-08-04
Woops
8119113f
stas00
commented on 2021-08-04
stas00
commented on 2021-08-04
Remove the notion of vanilla and point to the script instead
43bf26ce
Rephrase readme to use around 60 cores instead of 40
6ccf7d06
thomasw21
merged
7b998814
into main
4 years ago
hyunwoongko
commented on 2021-08-22
Reviewers
stas00
hyunwoongko
Assignees
No one assigned
Labels
None yet
Milestone
No milestone