Megatron-DeepSpeed
Preprocessing from arrow file to load an HF dataset
#264
Open

TevenLeScao wants to merge 6 commits into main from preprocess_from_HF_dataset
TevenLeScao preprocessing from arrow file to load an HF dataset
20f582fe
TevenLeScao requested a review from thomasw21 3 years ago
TevenLeScao requested a review from stas00 3 years ago
TevenLeScao making sure iterator actually advances
f30f97cb
TevenLeScao fixed confusion between row format and column format
c868bc6c
TevenLeScao Adding faster hack version
505e8cac
TevenLeScao Adding faster hack version
ea9051e0
TevenLeScao added dirty hack to have a maximum size
cb7648b0

