Megatron-DeepSpeed
7b998814 - Faster preprocessing (#18)

Commit
4 years ago
Faster preprocessing (#18)

* Propose a faster preprocessing mechanism by reducing inter-process communication
* Add flush in order to force print
* Try to prevent deadlocks
* Woops
* Trying to figure out what causes deadlock
* Limit queue size to 1_000_000
* Drastically reduce the maximum number of elements in the queue
* Threading does not use a worker
* Remove shard files and factorise shard naming
* Document the preprocessing script for a high number of workers
* Improve naming
* Update comments and readmes
* Woops
* Remove the notion of vanilla and point to the script instead
* Rephrase readme to use around 60 cores instead of 40

Co-authored-by: Thomas <รถ95242+thomasw21@users.noreply.github.com>
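
The queue-size bullets above suggest that a bounded queue was used to keep producers from outrunning consumers. The following is only a minimal sketch of that general idea, not the code from this commit: the function name `process_with_bounded_queue`, the `_DONE` sentinel, and the `max_queue_size` parameter are all hypothetical, and it uses a thread with `queue.Queue` to stay self-contained (`multiprocessing.Queue` accepts the same `maxsize` argument).

```python
import queue
import threading

# Hypothetical sentinel marking the end of the input stream.
_DONE = object()


def process_with_bounded_queue(items, transform, max_queue_size=8):
    """Apply `transform` to each item through a bounded queue.

    Illustrative only: a single consumer thread drains the queue, so
    results come back in input order. The producer blocks in `put`
    whenever the queue already holds `max_queue_size` items, which
    caps memory use (backpressure).
    """
    q = queue.Queue(maxsize=max_queue_size)
    results = []

    def consumer():
        while True:
            item = q.get()
            if item is _DONE:
                break
            results.append(transform(item))

    worker = threading.Thread(target=consumer)
    worker.start()

    for item in items:
        q.put(item)  # blocks while the queue is full
    q.put(_DONE)     # tell the consumer to stop

    worker.join()
    return results


doubled = process_with_bounded_queue(range(10), lambda x: x * 2, max_queue_size=4)
```

A small `max_queue_size` trades some throughput for a hard cap on buffered items. This sketch assumes `transform` does not raise; a consumer that dies would leave the producer blocked on a full queue.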