Unstructured-IO/unstructured
Commits
fix/sftp-connection-fix
1eeda89a  fixed type-o (mateusz-kuprowski, committed 2 years ago)
c02a6c75  Added more debug messaging (mateusz-kuprowski, committed 2 years ago)
ee062609  feat: keep all image elements when using `hi_res` strategy. (#2382) (christinestraub, committed 2 years ago, Verified)
1f0826ab  pin unstructured-client (#2392) (Coniferish, committed 2 years ago, Verified)
36faf677  enhancement: file detection for `.wav` files (#2387) (MthwRobinson, committed 2 years ago, Verified)
d7980b36  fix: elasticsearch serialization issue (#2399) (ryannikolaidis, committed 2 years ago, Verified)
f07fc6e0  chore: make Elasticsearch Destination connector write settings optional (#2398) (ryannikolaidis, committed 2 years ago, Verified)
2ce829dd  test: update test Elasticsearch mappings to validate embedding search (#2397) (ryannikolaidis, committed 2 years ago, Verified)
018cd7f7  fix: pinecone serialization issue (#2394) (ryannikolaidis, committed 2 years ago, Verified)
2f2c48ac  feat(ingest): add basic chunking to ingest (#2380) (scanny, committed 2 years ago, Verified)
50f142d4  chore(ingest): update pinecone index creation specifications (#2389) (ahmetmeleq, committed 2 years ago, Verified)
411aa98b  feat: Salesforce connector accepts key path or value (#2321) (#2327) (jakub-sandomierz-deepsense-ai, committed 2 years ago, Verified)
5581e6a4  fix: Ingest GCS accepts JSON auth token (#2322) (#2371) (jakub-sandomierz-deepsense-ai, committed 2 years ago, Verified)
bfd0258b  chore: refactor _convert_to_standard_langcode (#2369) (Coniferish, committed 2 years ago, Verified)
8dc130c9  fix: ensure consistency in method signatures across destination connectors (#2381) (rbiseck3, committed 2 years ago, Verified)
98a0de30  Fix sphinx error (#2384) (Ronny H, committed 2 years ago, Verified)
23edf2e9  feature(chunking): add basic strategy and overlap (#2367) (scanny, committed 2 years ago, Verified)
a8a103bc  bug: don't redact text when serialization if not value (#2379) (rbiseck3, committed 2 years ago, Verified)
22c0bad2  bug: weaviate serialization broken (#2378) (rbiseck3, committed 2 years ago, Verified)
b37b4689  drop python3.8 (#2372) (rbiseck3, committed 2 years ago, Verified)
e2f0de3c  chore: bump unstructured-inference=0.7.21 (#2361) (christinestraub, committed 2 years ago, Verified)
7caf2553  bug: omit session handler from serialization to avoid mp issues (#2366) (rbiseck3, committed 2 years ago, Verified)
0ca154a0  Fix: MongoDB connector URI password redaction, basic unit tests for Git connector (#2268) (jakub-sandomierz-deepsense-ai, committed 2 years ago, Verified)
e65a44ea  feat: update cct eval for text dir (#2299) (Klaijan, committed 2 years ago, Verified)
d6674ba2  chore: update ingest azure cognitive search endpoint (#2353) (ahmetmeleq, committed 2 years ago, Verified)
7a1e732a  feat(chunking): add inter-chunk overlap (#2309) (scanny, committed 2 years ago, Verified)
22cbdce7  fix(html): unequal row lengths in HTMLTable.text_as_html (#2345) (scanny, committed 2 years ago, Verified)
950e5d68  feat: adds postgresql/sqlite destination connector (#2005) (rvztz, committed 2 years ago, Verified)
5b0ae3fd  Refactor: rename image extraction kwargs (#2303) (christinestraub, committed 2 years ago, Verified)
8e2bfcab  Unstructured SaaS API subscription guide (#2341) (Ronny H, committed 2 years ago, Verified)