Faster preprocessing (#18)
* Propose a faster preprocessing mechanism by reducing inter-process communication
* Add flush to force printing
* Try to prevent deadlocks
* Woops
* Try to figure out what causes the deadlock
* Limit queue size to 1_000_000
* Drastically reduce the maximum number of elements in the queue
* Threading does not use a worker
* Remove shard files and factorise shard naming
* Document the preprocessing script for a high number of workers
* Improve naming
* Update comments and readmes
* Woops
* Remove the notion of vanilla and point to the script instead
* Rephrase readme to use around 60 cores instead of 40
Co-authored-by: Thomas <24695242+thomasw21@users.noreply.github.com>