Remote JSONL IFT data #275
support remote jsonl files for IFT datasets
3f318634
improve docstring
be630c03
add support for other extensions
eded4dfa
don't duplicate validation check
e82ed189
build dataset before tmpdir deletes
4a93eaaa
parse uri
92a14803
samhavens
marked this pull request as ready for review 2 years ago
only rank 0 download
f6380ecf
only download rank 0
8173998a
dakinggg
approved these changes
on 2023-06-06
better error
0f31e335
break earlier
c3676ff0
log more
67fc6151
more reasonable destination str
171aadb6
use data files format
4d420677
name points to a preprocessing function I guess
455c88a7
debugging
85dcf9bb
always something with HF
edabf535
json vs jsonl [no-ci]
93c50d1c
if hf wants it local, make it local [no-ci]
8b95fae7
back to tempfile [no-ci]
11204e1e
debug
3b9b85aa
debug hfds [no-ci]
b5158ebe
... [no-ci]
a6be0623
don't rename file
ee9f402a
use tempfile again
246f6ff7
Merge branch 'main' into remote-jsonl-ift
1211eae3
Merge branch 'main' into remote-jsonl-ift
bcf5ed62
merge main and cleanup
f39209d1
vchiley
merged
af209b38
into main 2 years ago
samhavens
deleted the remote-jsonl-ift branch 2 years ago
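The commit messages above (parse uri, only rank 0 download, build dataset before tmpdir deletes) suggest a flow that could be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: `load_remote_jsonl`, the `fetch` callable, and reading the rank from a `RANK` environment variable are all hypothetical names and assumptions, and only the `.jsonl` extension is handled here.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Optional
from urllib.parse import urlparse


def load_remote_jsonl(
    uri: str,
    fetch: Callable[[str, str], None],  # hypothetical downloader: fetch(uri, local_path)
    rank: Optional[int] = None,
) -> List[Dict]:
    """Download a remote JSONL file on rank 0 and parse it into rows."""
    # Parse the URI so the scheme and file extension can be validated up front.
    parsed = urlparse(uri)
    if not parsed.scheme:
        raise ValueError(f"expected a remote URI with a scheme, got: {uri!r}")
    ext = os.path.splitext(parsed.path)[1].lstrip(".")
    if ext != "jsonl":
        raise ValueError(f"unsupported extension {ext!r} in {uri!r}")

    # Assumption: the process rank is exposed through the RANK env variable.
    if rank is None:
        rank = int(os.environ.get("RANK", "0"))
    if rank != 0:
        # Only rank 0 downloads; a real trainer would synchronize the other
        # ranks (for example with a barrier) instead of returning nothing.
        return []

    # Everything that needs the file happens inside the `with` block, because
    # the temporary directory is deleted as soon as the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        destination = os.path.join(tmp_dir, f"data.{ext}")
        fetch(uri, destination)
        with open(destination, encoding="utf-8") as f:
            rows = [json.loads(line) for line in f if line.strip()]
    return rows
```

Passing the downloader in as a callable keeps the sketch independent of any particular object store client; the rows are fully materialized before the temporary directory is cleaned up.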