Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

Score: 86 / 100 (Verified)

14,211 stars and 4,977,320 monthly downloads. Used by 34 other packages. Actively maintained with 19 commits in the last 30 days. Available on PyPI.

Maintenance 17 / 25
Adoption 25 / 25
Maturity 25 / 25
Community 19 / 25


Stars: 14,211
Forks: 1,194
Language: HTML
License: Apache-2.0
Last pushed: Mar 04, 2026
Monthly downloads: 4,977,320
Commits (30d): 19
Dependencies: 23
Reverse dependents: 34

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/Unstructured-IO/unstructured"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
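
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It rests on two assumptions that the page does not confirm: that the URL path follows a category/owner/repository pattern (inferred from the single curl example above), and that the response body is JSON. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, as in the curl example.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumption: the body is JSON. Its schema is not documented on this page,
    so the decoded value is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the network request.
    print(quality_url("data-engineering", "Unstructured-IO", "unstructured"))
```

Without a key the limit is 100 requests/day, so a script that polls many repositories should cache responses rather than refetch them.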