feat: refactor ingest #3009
rbiseck3
force pushed
from
2c41d9b8
to
002824fc
1 year ago
rbiseck3
force pushed
from
002824fc
to
cc14e480
1 year ago
rbiseck3
changed the title from "feat: refactor ingest (WIP)" to "feat: refactor ingest" 1 year ago
rbiseck3
force pushed
from
4cfd6a14
to
aed69c3c
1 year ago
rbiseck3
force pushed
from
d040a369
to
58cdb613
1 year ago
rbiseck3
force pushed
from
58cdb613
to
253f1582
1 year ago
rbiseck3
force pushed
from
0f216ed1
to
90d8d30c
1 year ago
rbiseck3
force pushed
from
701bd89c
to
9c35a593
1 year ago
Create new interfaces to support more versatility in how ingest proce…
f48e6083
Begin fleshing out pipeline
4945d099
Add partitioner pipeline step
20fd7d1a
Add chunker pipeline step
f8c18f3e
Add upload pipeline step
7a6b8e44
Support file level reprocess flag
7bc79dfe
Add local destination as default
4d3a5c65
Add support for uncompress via new pipeline step
3898b220
Move files around
0f2822af
Add s3 connector
c3e71132
Add cli commands
db3fe7e5
bring over more logic from original implementation
97c14b7e
dynamically add new commands into existing list, annotated with v2 as…
59cce075
fix fsspec inputs
0b1e72a6
print all errors at the end of pipeline
934acb89
Add optional limit on connections when using asyncio
88e75875
Add entry to changelog
b0201ab0
support python3.9
33bf0408
improve type checking in fsspec connectors
b6a44344
Add __future__ to top level __init__ for v2 code
e7739a94
Add better type checking in cli command code
e7203a26
update fsspec metadata to include record locator info
7bb91d92
Fix endpoint param in s3 fsspec connector
d823510b
Small optimization in getting access configs from s3 connector config
0dd164df
Add recursive flag to local cli inputs
5f128de7
Add checks when getting values from os.stat
dd2706de
Add a classmethod to generate pipeline from configs
0ad80ca2
Add dependency check wrapper for s3 connector
6951a3a2
Add new README in v2
95ea1cd8
Fix local connector
044c7589
Fix await in s3 connector
2489290f
feat: refactor ingest <- Ingest test fixtures update (#3048)
79856018
Improve typing
84a6ee07
expose max connections in CLI
c47a8a1c
Add sequence diagram
91039e12
remove print statement
acd32202
Don't pass unset partition kwargs
2ab79942
skip confluence
bd1315ea
feat: refactor ingest <- Ingest test fixtures update (#3059)
41361f49
Add back in confluence tests
5c7cfbba
fix s3 uploader
467a8878
fix s3 uploader
11312bae
Skip date created for minio as this will never be consistent
0d8b5b9d
tidy shell
811f3bf1
skip confluence
fef31348
feat: refactor ingest <- Ingest test fixtures update (#3060)
ff9ef036
Add back in confluence tests
d1ce6949
fix minio test
57ff33de
Update use of chunking strategy in CLI inputs
ff52ecef
fix chunk strategy cli param
eb092100
feat: refactor ingest <- Ingest test fixtures update (#3064)
aa537116
rbiseck3
force pushed
from
51038072
to
aa537116
1 year ago
Add back in elasticsearch_elements_mappings.json into the es scripts dir
57b6f2b2
Add back in elasticsearch_elements_mappings.json into the opensearch …
58e1f8ff
rbiseck3
merged
3eaf65a8
into main 1 year ago
rbiseck3
deleted the roman/refactor-ingest branch 1 year ago
Assignees
No one assigned