unstructured
fd293b3e - feat: add elasticsearch destination connector (#2152)

Commit
2 years ago
feat: add elasticsearch destination connector (#2152)

Closes https://github.com/Unstructured-IO/unstructured/issues/1842
Closes https://github.com/Unstructured-IO/unstructured/issues/2202
Closes https://github.com/Unstructured-IO/unstructured/issues/2203

This PR:
- Adds an Elasticsearch destination connector, so documents can be ingested from any supported source, embedded, and written (embeddings and documents) into Elasticsearch.
- Defines an example unstructured elements schema so users can easily set up their unstructured Elasticsearch indexes.
- Includes parallelized upload and lazy processing for the Elasticsearch destination connector.
- Rearranges the Elasticsearch test helpers into source, destination, and common folders.
- Adds util functions to batch iterables lazily for uploads.
- Fixes a bug where removing the optional parameter `--fields` broke the connector due to an integer processing error.
- Fixes a bug where using an [elasticsearch config](https://github.com/Unstructured-IO/unstructured/blob/8fa5cbf036c4b6a29a8e6c0cd81f22ef3ae84ed1/unstructured/ingest/connector/elasticsearch.py#L26-L35) for a destination connector resulted in a serialization issue when the optional parameter `--fields` was not provided.
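
For readers unfamiliar with lazy batching, the idea behind the batching utilities mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code added in this PR: the name `batch_lazily` and its signature are hypothetical, and the actual helpers may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_lazily(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of up to `batch_size` items, reading the input on demand.

    Hypothetical helper for illustration only; not the PR's actual util.
    """
    # Because this is a generator, the check runs when the first batch is requested.
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls at most batch_size items, so the full input
        # is never materialized in memory at once.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Usage: each batch could be handed to an uploader as soon as it is produced.
for batch in batch_lazily(range(5), batch_size=2):
    print(batch)
```

Only one batch is held in memory at a time, which is what makes this shape a plausible fit for uploading large element streams in chunks.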