danielfrees/scrapemed

ScrapeMed: Data scraping for PubMed Central.

Score: 33 / 100 (Emerging)

Provides Pythonic, object-oriented access to PubMed Central (PMC) articles by downloading, validating, and parsing raw PMC XML into standardized `Paper` objects with extracted metadata, references, and structured sections. Integrates with ChromaDB and LangChain for semantic vectorization and natural language querying, supports conversion to pandas for data science workflows, and offers advanced search via PMC's search API.
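The parsing step described above can be illustrated with a minimal, standard-library-only sketch. This is not ScrapeMed's implementation or API: `SketchPaper`, `parse_paper`, and the hand-written `SAMPLE_XML` fragment are hypothetical names and data introduced here, and real PMC XML is considerably richer than this fragment.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

# Hand-written JATS-style fragment standing in for real PMC XML (assumption).
SAMPLE_XML = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="pmc">0000000</article-id>
      <title-group><article-title>An Example Title</article-title></title-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p></sec>
    <sec><title>Methods</title><p>Second paragraph.</p><p>Third paragraph.</p></sec>
  </body>
</article>"""


@dataclass
class SketchPaper:
    """Hypothetical stand-in for a standardized paper object."""
    pmcid: str
    title: str
    sections: dict = field(default_factory=dict)


def parse_paper(xml_text: str) -> SketchPaper:
    """Parse an XML string into a SketchPaper (illustrative only)."""
    root = ET.fromstring(xml_text)
    # Metadata: the PMC identifier and the article title.
    pmcid = root.findtext(".//article-id[@pub-id-type='pmc']", default="")
    title = root.findtext(".//article-title", default="")
    # Structured sections: map each top-level heading to its joined paragraphs.
    sections = {}
    for sec in root.findall("./body/sec"):
        heading = sec.findtext("title", default="Untitled")
        paragraphs = ["".join(p.itertext()).strip() for p in sec.findall("p")]
        sections[heading] = " ".join(paragraphs)
    return SketchPaper(pmcid=pmcid, title=title, sections=sections)


paper = parse_paper(SAMPLE_XML)
```

Keeping sections in a plain dictionary keyed by heading makes the object easy to flatten into a table row later, which is the kind of shape a pandas conversion would need.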

No commits in the last 6 months. Available on PyPI.

Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 18 / 25
Community 5 / 25


Stars: 15
Forks: 1
Language: Python
License: MIT
Last pushed: Jan 06, 2024
Monthly downloads: 51
Commits (30d): 0
Dependencies: 15

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/danielfrees/scrapemed"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
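The same request can be sketched in Python with only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the owner/repo URL pattern is inferred from the single example URL above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base path copied from the curl example; the trailing owner/repo pattern
# is an assumption inferred from that one URL.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a repository."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the quality data, assuming a JSON response body."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("danielfrees", "scrapemed")` issues one unauthenticated request, which counts against the 100 requests/day limit.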