mcs07/ChemDataExtractor

Automatically extract chemical information from scientific documents

65 / 100
Established

Combines multi-format document parsing (HTML, XML, PDF) with chemistry-aware NLP and rule-based grammars to extract chemical entities, properties, and spectroscopic data from unstructured text and tables. Features a document processing layer that resolves data interdependencies across extracted information, enabling structured knowledge assembly from scattered references within scientific papers.

349 stars and 518 monthly downloads. No commits in the last 6 months. Available on PyPI.

Stale 6m
No Dependents
Maintenance 0 / 25
Adoption 16 / 25
Maturity 25 / 25
Community 24 / 25
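
The four category scores listed here add up to the overall 65 / 100. A minimal Python sketch of that arithmetic follows. That the overall score is a plain sum is inferred from these numbers, not from a documented formula, and the names `category_scores` and `overall_score` are illustrative only.

```python
# Category scores as listed on this page: (points earned, points possible).
category_scores = {
    "Maintenance": (0, 25),
    "Adoption": (16, 25),
    "Maturity": (25, 25),
    "Community": (24, 25),
}


def overall_score(scores):
    """Sum earned and possible points (assumed formula: a plain sum)."""
    earned = sum(e for e, _ in scores.values())
    possible = sum(p for _, p in scores.values())
    return earned, possible


earned, possible = overall_score(category_scores)
print(f"{earned} / {possible}")  # prints "65 / 100"
```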

Stars: 349
Forks: 122
Language: Python
License: MIT
Last pushed: Jul 27, 2023
Monthly downloads: 518
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mcs07/ChemDataExtractor"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
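
The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single curl example above, the response is assumed to be JSON (its fields are not documented here), and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the endpoint returns JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(build_quality_url("nlp", "mcs07", "ChemDataExtractor"))
    # Uncomment to make the real network call (counts against the daily limit):
    # print(fetch_quality("nlp", "mcs07", "ChemDataExtractor"))
```

Keeping URL construction separate from the network call lets the path be checked without spending any of the daily request quota.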