Support DataLoader with num_workers > 0 in streaming mode #4375
make TorchIterableDataset work in parallel
58289fc3
start writing some tests
c26a9f11
Merge branch 'master' into parallel-torch-iterable-dataset
4c3ce960
fix streaming extension and fsspec issues in subprocesses
8c60fa30
fix some tests
6dc859e5
fix more tests
7056f1a0
Merge branch 'master' into parallel-torch-iterable-dataset
edef69b1
fix import
c0a0492e
fix and add tests
7043816a
fix patch (handle successive patches and builtins)
a9ea9559
revert unnecessary change to enriched_web_blg
07d4c0e4
style
af5de1ac
use open locally to fix win permission errors
b84ae0ea
keep file opened in read_csv
17467121
Merge branch 'master' into parallel-torch-iterable-dataset
bc837ce0
fix compression for read_csv
fe269bff
consistency of read_csv: don't infer compression for file-like objects
482c4fbe
stringify Path objects
54e9f39c
lhoestq marked this pull request as ready for review 3 years ago
comments + raise error if sharding is ambiguous
8f5579eb
Merge branch 'master' into parallel-torch-iterable-dataset
ab91dbdf
minor
1b87fb3b
Merge branch 'master' into parallel-torch-iterable-dataset
b675a694
Update src/datasets/iterable_dataset.py
816d5912
Merge branch 'master' into parallel-torch-iterable-dataset
ff586c46
lhoestq merged ab7d3045 into master 3 years ago
lhoestq deleted the parallel-torch-iterable-dataset branch 3 years ago
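For readers skimming the commit list: the title and the "raise error if sharding is ambiguous" commit suggest that each DataLoader worker ends up reading its own subset of the dataset's shards. The snippet below is only a sketch of that general idea under stated assumptions, not code taken from this PR's diff. The helper `shards_for_worker`, the round-robin assignment, and the shard file names are all hypothetical.

```python
from typing import Dict, List


def shards_for_worker(shards: List[str], worker_id: int, num_workers: int) -> List[str]:
    """Return the shards a given worker should read (hypothetical helper).

    Round-robin assignment is an assumption made for illustration;
    the PR may split the work differently.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if not 0 <= worker_id < num_workers:
        raise ValueError("worker_id must be in [0, num_workers)")
    # Worker k takes shards k, k + num_workers, k + 2 * num_workers, ...
    return shards[worker_id::num_workers]


# Hypothetical shard names, used only to show the split.
shards = [f"data-{i:05d}.jsonl" for i in range(5)]
assignment: Dict[int, List[str]] = {
    worker_id: shards_for_worker(shards, worker_id, num_workers=2)
    for worker_id in range(2)
}
```

With this kind of split every shard is read by exactly one worker, so examples are not duplicated across workers. If there are fewer shards than workers, some workers receive nothing to read.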