Add pre-tokenized Delta to MDS conversion script #1680
delta to mds script v1
ed15c039
remove open folder
5379d5b1
debug
48d26e4e
added intermediate jsonl
aa9edbfd
update script
fd54b595
cast to ndarray
2095115b
nit
6a75da57
revert delta->jsonl refactor
a1a5274d
nit
02dfcb56
update col name
4cf6d235
use dtypes
2932a9b6
Merge remote-tracking branch 'origin' into matt/split-mds-script
23635c4d
dbugging message
04e628eb
test bugfix
08bc526e
logic is hard
21abadaf
more testing
819c1126
Merge remote-tracking branch 'origin/main' into matt/split-mds-script
accb12b4
remove debug msg
19bf0a49
assume single turn input
b5bf28cf
reuse convert_ft_dataset fn
9372f488
update for ft
408b96fe
fix split
f47cfab3
revert a few commits to not break
46fc2d0b
rename file to train.jsonl
bb757c7e
add debugging statement
6c3e0a7b
change debugging statement
2712e2b0
Merge branch 'main' into matt/split-mds-script
bdcd3c0e
remove debugging statements
ef37a3f4
mattyding force pushed from 750a241c to ef37a3f4 1 year ago
remove diff
919aacd0
mattyding marked this pull request as ready for review 1 year ago
mattyding force pushed from cf378e02 to 919aacd0 1 year ago
re-add fix for multi-turn
7a2aaaab
bump
654d084f
Merge branch 'main' into matt/split-mds-script
5b918c42
update naming
38652da7
debug
f8640ea0
debug
92c58861
convert ndarray to bytes
512a165c
debug
f47093f8
first try specifying bytes type
d10e89e8
attempt
d87886ec
mattyding marked this pull request as draft 362 days ago
print out dtype for debugging
2d7dbde1
do the thing
01c953b7
fix
084a2e48
i love debugging
dda4ec7a
fix
5120abdc
fix
ee8e274c
fix
0ae048cb