Unstructured-IO/unstructured
Pull Requests
fix(deps): Update opensearchproject/opensearch Docker tag to v2.19.5
dependencies
security
#4287 opened 2026-03-20 20:55 by utic-renovate[bot]
refactor: don't import unstructured-inference via partition.pdf
#4284 opened 2026-03-16 13:48 by artdent
fix: improve multi-column layout sorting for academic papers (#4104)
#4283 opened 2026-03-16 00:07 by Gopesh111
Replace lazyproperty with functools.cached_property
#4282 opened 2026-03-10 01:51 by KRRT7
refactor: replace deprecated decorators in partition_image with apply_metadata
#4271 opened 2026-03-02 12:55 by HemantSudarshan
fix: add 'el' and 'gr' as Greek language code aliases for Tesseract OCR
#4270 opened 2026-02-27 18:45 by s0wa48
fix(deps): Update semitechnologies/weaviate Docker tag to v1.36.6
dependencies
security
#4267 opened 2026-02-26 18:22 by utic-renovate[bot]
fix: handle list output from group_bullet_paragraph in element apply()
#4253 opened 2026-02-21 20:04 by s0wa48
Simple typo fix
#4251 opened 2026-02-20 08:06 by rchen19
Feat: embedding model voyage 4 family
#4234 opened 2026-02-11 18:12 by fzowl
feat: add XLSM (Excel Macro-Enabled Workbook) parsing support
#4227 opened 2026-02-08 16:51 by longway-code
Add AgentMarket - B2A Marketplace
#4225 opened 2026-02-03 17:14 by stromfee
docs: fix redundant whitespace in pyenv command in README
#4224 opened 2026-02-03 13:38 by longway-code
fix(deps): Update docker.elastic.co/elasticsearch/elasticsearch Docker tag to v8.19.13
dependencies
security
#4223 opened 2026-02-03 12:19 by utic-renovate[bot]
Fix FutureWarning: Add test to verify bytes are wrapped in BytesIO for read_excel
#4213 opened 2026-01-27 12:59 by Achieve3318
⚡️ Speed up function `merge_out_layout_with_ocr_layout` by 30%
#4212 opened 2026-01-27 02:31 by aseembits93
⚡️ Speed up function `standardize_quotes` by 144%
#4201 opened 2026-01-21 02:31 by KRRT7
feat: chunking by character and title now isolates tables
#4197 opened 2026-01-15 19:26 by badGarnet
fix: NameError: LayoutElements not defined in paddle_ocr.py
#4195 opened 2026-01-15 16:18 by mohansinghi
Eliminate cleaners/core import time bottleneck
#4167 opened 2026-01-07 03:44 by aseembits93
update README.md
#4121 opened 2025-11-12 10:57 by vhsakpal
new file: .idx/mcp.json
#4111 opened 2025-11-05 02:21 by romethefixer
Bug 4105
#4107 opened 2025-10-13 20:35 by carminoplata
fix: None text attribute when normalizing Picture to Image element
#4083 opened 2025-08-22 15:25 by ishahroz
Switch from pdfminer to paves to improve robustness and use multiple CPUs
#4067 opened 2025-07-19 04:10 by dhdaines
perf: add early page count check to prevent expensive PDFMiner proces…
#4048 opened 2025-07-08 20:09 by CyMule
Feature/remove unnessary re for table ele in pdf
#3984 opened 2025-04-09 11:24 by JIAQIA
bugfix/fix missing extensions in file detection
#3926 opened 2025-02-18 17:24 by rbiseck3
Improve readability of the text by adding new line to the end of row
#3913 opened 2025-02-07 14:56 by Sheripov
fix: preserve text after line breaks in PowerPoint table cells
#3877 opened 2025-01-18 04:07 by yamazombie