datasets
Support DataLoader with num_workers > 0 in streaming mode
#4375

Merged

Commits

make TorchIterableDataset work in parallel

lhoestq committed 3 years ago
start writing some tests

lhoestq committed 3 years ago
Merge branch 'master' into parallel-torch-iterable-dataset

lhoestq committed 3 years ago
fix streaming extension and fsspec issues in subprocesses

lhoestq committed 3 years ago
fix some tests

lhoestq committed 3 years ago
fix more tests

lhoestq committed 3 years ago
Merge branch 'master' into parallel-torch-iterable-dataset

lhoestq committed 3 years ago
fix import

lhoestq committed 3 years ago
fix and add tests

lhoestq committed 3 years ago
fix patch (handle successive patches and builtins)

lhoestq committed 3 years ago
revert unnecessary change to enriched_web_blg

lhoestq committed 3 years ago
style

lhoestq committed 3 years ago
use open locally to fix win permission errors

lhoestq committed 3 years ago
keep file opened in read_csv

lhoestq committed 3 years ago
Merge branch 'master' into parallel-torch-iterable-dataset

lhoestq committed 3 years ago
fix compression for read_csv

lhoestq committed 3 years ago
consistency of read_csv: don't infer compression for file-like objects

lhoestq committed 3 years ago
stringify Path objects

lhoestq committed 3 years ago
comments + raise error if sharding is ambiguous

lhoestq committed 3 years ago
Merge branch 'master' into parallel-torch-iterable-dataset

lhoestq committed 3 years ago
minor

lhoestq committed 3 years ago
Merge branch 'master' into parallel-torch-iterable-dataset

lhoestq committed 3 years ago
Update src/datasets/iterable_dataset.py

lhoestq committed 3 years ago
Merge branch 'master' into parallel-torch-iterable-dataset

lhoestq committed 3 years ago
