Megatron-DeepSpeed
update merge_preprocessed_data to use distributed merge
#82
Merged

adammoody
adammoody update merge_preprocessed_data to use parallel merge
cb5e674b
adammoody changed the title from "WIP: update merge_preprocessed_data to use parallel merge" to "update merge_preprocessed_data to use parallel merge" 4 years ago
thomasw21 commented on 2021-09-15
adammoody indexed_dataset: add docstrings to merge and gather methods
ab81641b
adammoody merge_preprocessed_data: tweak interface, add documentation
ff37f443
adammoody Merge branch 'main' into mergescript
9babb815
adammoody merge: improvements after testing
cde15ea6
adammoody tests: serial and distributed merge
cdb74778
adammoody avoid setting pythonpath within script
549b3fe9
adammoody merge script: fix typo in usage comments
942fcdad
thomasw21 approved these changes on 2021-09-21
adammoody print default backend when not set in distributed merge
ff63c320
adammoody changed the title from "update merge_preprocessed_data to use parallel merge" to "update merge_preprocessed_data to use distributed merge" 4 years ago
thomasw21 merged 0c820648 into main 4 years ago

