CogStack/CogStack-Pipeline

Distributed, fault-tolerant batch processing for Natural Language Applications and Search, using remote partitioning

Archived
Score: 42 / 100 (Emerging)

Built on Spring Batch, it reads from multiple data sources (databases, files), applies configurable NLP processing steps, and outputs annotated JSON directly to Elasticsearch, files, or databases. The architecture uses Docker Compose for deployment and supports distributed processing with worker nodes for handling large-scale EHR datasets. Designed specifically for healthcare NLP workflows, it handles both structured and unstructured data, including PDFs and clinical notes, and is intended to run in resource-constrained environments.
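To make the remote-partitioning idea concrete, here is a minimal sketch of how a manager node might split a range of document IDs into contiguous chunks for worker nodes. It is an illustration only, not CogStack-Pipeline's actual code: the function name `partition_key_range` and the parameters `min_id`, `max_id`, and `grid_size` are hypothetical, and the sketch assumes documents are addressed by a contiguous integer primary key.

```python
from typing import List, Tuple


def partition_key_range(min_id: int, max_id: int, grid_size: int) -> List[Tuple[int, int]]:
    """Split the inclusive key range [min_id, max_id] into at most
    grid_size contiguous (start, end) partitions.

    Hypothetical helper for illustration; not part of CogStack-Pipeline.
    """
    if grid_size < 1:
        raise ValueError("grid_size must be at least 1")
    if max_id < min_id:
        return []  # nothing to process

    total = max_id - min_id + 1
    # Ceiling division so that every key falls into some partition.
    chunk = -(-total // grid_size)

    partitions = []
    start = min_id
    while start <= max_id:
        end = min(start + chunk - 1, max_id)
        partitions.append((start, end))
        start = end + 1
    return partitions
```

Each `(start, end)` pair could then be handed to a separate worker, which reads, processes, and writes only its own slice. Because the slices do not overlap, a failed partition can be retried without touching the others. For example, `partition_key_range(1, 10, 3)` yields `[(1, 4), (5, 8), (9, 10)]`.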

No commits in the last 6 months.

Archived, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 48
Forks: 13
Language: Java
License: (not listed)
Last pushed: Jan 13, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/CogStack/CogStack-Pipeline"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
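The same request can be made from a script. The sketch below uses only the Python standard library. The endpoint path is copied from the curl command above, but the helper names `quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def quality_url(owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumes the response body is JSON; the response schema is not
    documented on this page, so the result is returned unparsed beyond
    JSON decoding.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("CogStack", "CogStack-Pipeline")` issues the same GET request as the curl command. Since unauthenticated use is limited to 100 requests/day, a script that polls many repositories would want to cache results locally.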