DeepSpeed
Write multiple items to output file at once, in distributed data analyzer.
#5169
Merged

conglongli merged 45 commits into deepspeedai:master from write_multiple_items_at_once_in_distributed_data_analyzer
bm-synth
bm-synth added assert of torch vs numpy types
14f2bbe1
bm-synth first draft
796341d9
bm-synth reverted to original master
07aa4b42
bm-synth added metric type accumulate_value_over_samples
815a7897
bm-synth pre-commit
28a72e7c
bm-synth Merge branch 'master' into distributed_data_analyzer
e8dbf0b3
bm-synth Merge branch 'distributed_data_analyzer' of github.com:bm-synth/DeepS…
ec3479fa
bm-synth Update data_analyzer.py
38d7ce66
bm-synth added check for single node reduce. added barriers
295fba67
bm-synth more bug fixes
4144e427
bm-synth new iteration, many bug fixes
a1e121c9
bm-synth bug fixes
e045753c
bm-synth Merge branch 'master' into distributed_data_analyzer
3a891162
bm-synth fixing previous commit
cdc838c1
bm-synth Merge branch 'master' into distributed_data_analyzer
ba34a550
bm-synth pre-commit
5c077104
bm-synth Merge branch 'distributed_data_analyzer' of github.com:bm-synth/DeepS…
87d76867
bm-synth write sequentially to file
a634787f
bm-synth Merge branch 'master' into distributed_data_analyzer
848ffd5d
bm-synth fixes in sequential write
ec59f08d
bm-synth Merge branch 'distributed_data_analyzer' of github.com:bm-synth/DeepS…
832874c2
bm-synth pre-commit hooks
ea0d65f5
bm-synth Merge branch 'master' into distributed_data_analyzer
c6c9bc5b
bm-synth added main as example
56a95338
bm-synth Merge branch 'distributed_data_analyzer' of github.com:bm-synth/DeepS…
b4d86543
bm-synth Merge branch 'master' into distributed_data_analyzer
676dc1a3
bm-synth Update data_analyzer.py
6788af55
bm-synth first working version. idx files differ
bd61d9c2
bm-synth Merge branch 'distributed_data_analyzer' of github.com:bm-synth/DeepS…
7ac5e45c
bm-synth added missing static function
8bf0e635
bm-synth removed/added breaklines to match base code
e5a7eb0f
bm-synth corrected comment
3b8014fd
bm-synth imports
5a426879
bm-synth removed main
cdaad362
bm-synth reverted main
b3d40620
bm-synth bug fix in sample calculation
7cabfa2a
bm-synth added worker_an and num_worker to kwargs
62f68dd1
bm-synth removed dist.initialize() from DataAnalyzer.run_map_reduce
6d35e454
bm-synth first iteration
be91d37c
bm-synth updated with add_items
5fd05468
bm-synth added add_items
f5be5e1b
bm-synth Merge branch 'master' into write_multiple_items_at_once_in_distribute…
1ccd3bad
bm-synth Update indexed_dataset.py
9d5c1715
bm-synth marked this pull request as ready for review 2 years ago
bm-synth requested a review from conglongli 2 years ago
bm-synth changed the title from "Write multiple items to file at once in distributed data analyzer" to "Write multiple items to output file at once, in `DistributedDataAnalyzer`." 2 years ago
bm-synth changed the title from "Write multiple items to output file at once, in `DistributedDataAnalyzer`." to "Write multiple items to output file at once, in distributed data analyzer." 2 years ago
conglongli Merge branch 'master' into write_multiple_items_at_once_in_distribute…
dc1dbb30
conglongli approved these changes on 2024-02-21
conglongli assigned conglongli 2 years ago
bm-synth formatting
db29942f
conglongli enabled auto-merge 2 years ago
conglongli merged d5fa87ff into master 2 years ago
