llm-foundry
Add pre-tokenized Delta to MDS conversion script
#1680
Open

mattyding wants to merge 46 commits into main from matt/split-mds-script
mattyding delta to mds script v1
ed15c039
mattyding remove open folder
5379d5b1
mattyding debug
48d26e4e
mattyding added intermediate jsonl
aa9edbfd
mattyding update script
fd54b595
mattyding cast to ndarray
2095115b
mattyding nit
6a75da57
mattyding revert delta->jsonl refactor
a1a5274d
mattyding nit
02dfcb56
mattyding update col name
4cf6d235
mattyding use dtypes
2932a9b6
mattyding Merge remote-tracking branch 'origin' into matt/split-mds-script
23635c4d
mattyding commented on 2024-11-30
mattyding dbugging message
04e628eb
mattyding test bugfix
08bc526e
mattyding logic is hard
21abadaf
mattyding more testing
819c1126
mattyding Merge remote-tracking branch 'origin/main' into matt/split-mds-script
accb12b4
mattyding remove debug msg
19bf0a49
mattyding assume single turn input
b5bf28cf
mattyding reuse convert_ft_dataset fn
9372f488
mattyding update for ft
408b96fe
mattyding fix split
f47cfab3
mattyding revert a few commits to not break
46fc2d0b
mattyding rename file to train.jsonl
bb757c7e
mattyding add debugging statement
6c3e0a7b
mattyding change debugging statement
2712e2b0
mattyding Merge branch 'main' into matt/split-mds-script
bdcd3c0e
mattyding remove debugging statements
ef37a3f4
mattyding force pushed from 750a241c to ef37a3f4 1 year ago
mattyding remove diff
919aacd0
mattyding marked this pull request as ready for review 1 year ago
mattyding requested a review 1 year ago
dakinggg commented on 2025-03-05
mattyding requested a review from dakinggg 1 year ago
dakinggg commented on 2025-03-17
mattyding force pushed from cf378e02 to 919aacd0 1 year ago
mattyding re-add fix for multi-turn
7a2aaaab
mattyding bump
654d084f
mattyding Merge branch 'main' into matt/split-mds-script
5b918c42
mattyding update naming
38652da7
mattyding debug
f8640ea0
mattyding debug
92c58861
mattyding convert ndarray to bytes
512a165c
mattyding debug
f47093f8
mattyding first try specifying bytes type
d10e89e8
mattyding attempt
d87886ec
mattyding marked this pull request as draft 362 days ago
mattyding print out dtype for debugging
2d7dbde1
mattyding do the thing
01c953b7
mattyding fix
084a2e48
mattyding i love debugging
dda4ec7a
mattyding fix
5120abdc
mattyding fix
ee8e274c
mattyding fix
0ae048cb
