unstructured
refactor: unstructured ingest as a pipeline
#1551
Merged

ryannikolaidis merged 52 commits into main from roman/ingest-pipeline
rbiseck3 requested a review from ryannikolaidis 2 years ago
rbiseck3 requested a review from ahmetmeleq 2 years ago
rbiseck3 force pushed from b780f476 to 37813cee 2 years ago
rbiseck3 marked this pull request as ready for review 2 years ago
rbiseck3 force pushed from 37813cee to 2df5fd30 2 years ago
rbiseck3 requested a review from badGarnet 2 years ago
rbiseck3 added enhancement
rbiseck3 added ingest
badGarnet commented on 2023-10-02
rbiseck3 force pushed from d8592aee to 73619b40 2 years ago
rbiseck3 force pushed from b513dc97 to 1a998b68 2 years ago
rbiseck3 force pushed from f39a19c2 to 6fa71e19 2 years ago
rbiseck3 force pushed from 5923c932 to 00090215 2 years ago
rbiseck3 force pushed from 247958dd to 2474bd27 2 years ago
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
ryannikolaidis commented on 2023-10-06
rbiseck3 WIP: refactoring to support pipeline
758b43c9
rbiseck3 WIP: added properties to serialization of ingest docs
12d59c6a
rbiseck3 refactor pipeline approach
a07e4b13
rbiseck3 complete all steps of pipeline
9ae5d3f0
rbiseck3 Add step to copy to final destination
e60a0054
rbiseck3 fix how hashing occurs to allow reproducability
e8c1086e
rbiseck3 Update sharepoint and s3 to use pipeline
a29f5fca
rbiseck3 Update airtable to use pipeline
b63f6846
rbiseck3 Update azure to use pipeline
bb86fef1
rbiseck3 Update biomed to use pipeline
8ca2d199
rbiseck3 Update box to use pipeline
018de959
rbiseck3 Update confluence to use pipeline
9fe0a061
rbiseck3 Update delta table to use pipeline
3ae7dfb5
rbiseck3 Update discord to use pipeline
0e842802
rbiseck3 Update dropbox to use pipeline
3e8169d0
rbiseck3 Update elasticsearch to use pipeline
e5b8476d
rbiseck3 Update fsspec to use pipeline
1b38a44c
rbiseck3 Update gcs to use pipeline
fa5a979f
rbiseck3 Update github to use pipeline
7f1be8d4
rbiseck3 Update gitlab to use pipeline
f50008ec
rbiseck3 Update google drive to use pipeline
09a9c3dd
rbiseck3 Update jira to use pipeline
cd154c44
rbiseck3 Update local to use pipeline
30e5de99
rbiseck3 Update notion to use pipeline
67baf047
rbiseck3 Update onedrive to use pipeline
330ba2da
rbiseck3 Update outlook to use pipeline
1c2203ed
rbiseck3 Update reddit to use pipeline
aeb5635e
rbiseck3 Update salesforce to use pipeline
8f151a79
rbiseck3 Update slack to use pipeline
64ee676c
rbiseck3 Update wikipedia to use pipeline
fa1412bd
rbiseck3 lint fixes
29e56b4b
rbiseck3 Drop unit test of file that was removed
40572ab3
rbiseck3 Update Changelog
20c86db5
rbiseck3 Update tests to use explicit work dir
24b9fa23
rbiseck3 Fix unit tests
ccdc1e22
rbiseck3 Fix unit tests
21472aab
rbiseck3 run pip-compile
a47ca6a4
rbiseck3 run pip-compile
0f03e6c7
rbiseck3 pin version of onnxruntime
2f37a806
rbiseck3 Fix partition to run process_file
f9b684dd
rbiseck3 fix getting source metadata in fsspec connector
b3eec1d9
rbiseck3 Fix unit tests
b2b36da1
rbiseck3 fix delta table dest connector
59dc01aa
rbiseck3 fix delta s3 dest connector
62bd99ad
rbiseck3 fix getting source metadata in salesforce connector
f7d87965
rbiseck3 improve getting record info in salesforce connector
d902a924
rbiseck3 Testing ingest tests fix
9ae6d7a5
rbiseck3 revert deps
448cf351
rbiseck3 fix PR comments
4bb636b0
rbiseck3 force pushed from c9f9c312 to 4bb636b0 2 years ago
rbiseck3 fix getting source metadata lazily
53d4453c
rbiseck3 fix linting
097cb243
rbiseck3 Rename all runner clases to include Runner in name
078d35a9
ryannikolaidis approved these changes on 2023-10-06
ryannikolaidis changed the title from "roman/refactor ingest as pipeline" to "refactor: unstructured ingest as a pipeline" 2 years ago
ryannikolaidis enabled auto-merge 2 years ago
ryannikolaidis merged 2e1404e0 into main 2 years ago
ryannikolaidis deleted the roman/ingest-pipeline branch 2 years ago
