DeepSpeed
9c696620 - Fix issue with zero-sized file after merging file on curriculum `map_reduce` (#5106)

Commit
1 year ago
In `deepspeed/runtime/data_pipeline/data_sampling/indexed_dataset.py`, when calling `merge_file_`, the following operation may not flush the merged file before it is needed:

```
# Concatenate data
with open(data_file_path(another_file), 'rb') as f:
    shutil.copyfileobj(f, self._data_file)
```

This leaves `self._data_file` with size zero and later causes the following error (with stack trace):

```
  File "~/my_code/deepspeed_trainer.py", line 999, in my_func
    data_analyzer.run_reduce()
  File "~/my_env/lib/python3.11/site-packages/deepspeed/runtime/data_pipeline/data_sampling/data_analyzer.py", line 413, in run_reduce
    self.merge_map_results(self.dataset, self.metric_names, self.metric_types, self.save_path,
  File "~/my_env/lib/python3.11/site-packages/deepspeed/runtime/data_pipeline/data_sampling/data_analyzer.py", line 371, in merge_map_results
    index_to_sample = MMapIndexedDataset(index_to_sample_fname, skip_warmup=True)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/my_env/lib/python3.11/site-packages/deepspeed/runtime/data_pipeline/data_sampling/indexed_dataset.py", line 486, in __init__
    self._do_init(path, skip_warmup)
  File "~/my_env/lib/python3.11/site-packages/deepspeed/runtime/data_pipeline/data_sampling/indexed_dataset.py", line 502, in _do_init
    self._bin_buffer_mmap = np.memmap(data_file_path(self._path), mode='r', order='C')
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/my_env/lib/python3.11/site-packages/numpy/core/memmap.py", line 268, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot mmap an empty file
```

This PR fixes the issue by forcing the destination file to be flushed and adding an assert to make sure the concatenation succeeded.

deepspeed version: '0.13.2'

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
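
For illustration, the flush-then-assert idea described in this commit message can be sketched in isolation. This is a minimal sketch, not the actual patch: `append_file` and `demo` are hypothetical names that do not exist in DeepSpeed, and the size check via `os.stat` is an assumption about how one might verify the concatenation.

```python
import os
import shutil
import tempfile


def append_file(dest, src_path):
    """Append the bytes of src_path to the open binary file object dest.

    Hypothetical helper for illustration only; not part of DeepSpeed.
    """
    with open(src_path, "rb") as src:
        shutil.copyfileobj(src, dest)
    # Push Python's userspace buffer to the OS so that other readers
    # (for example a later np.memmap on the same path) can see the bytes.
    dest.flush()
    # Sanity check: the size reported by the filesystem should now match
    # the current write position of the destination file.
    assert os.stat(dest.name).st_size == dest.tell(), "merged file is incomplete"


def demo():
    """Write a small source file, merge it, and return the merged size
    as seen by the filesystem while the destination is still open."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "part.bin")
        dest_path = os.path.join(tmp, "merged.bin")
        with open(src_path, "wb") as f:
            f.write(b"abc" * 10)
        with open(dest_path, "wb") as dest:
            append_file(dest, src_path)
            size_while_open = os.stat(dest_path).st_size
    return size_while_open
```

Without the `flush()` call, a small write like the one in `demo` would typically still sit in Python's write buffer, so the file could appear empty to any reader that opens it before the writer is closed.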