update merge_preprocessed_data to use distributed merge (#82)
* update merge_preprocessed_data to use parallel merge
* indexed_dataset: add docstrings to merge and gather methods
* merge_preprocessed_data: tweak interface, add documentation
* merge: improvements after testing
* tests: serial and distributed merge
* avoid setting PYTHONPATH within script
* merge script: fix typo in usage comments
* print default backend when not set in distributed merge