CogStack/CogStack-Pipeline
Distributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning
Archived
Built on Spring Batch, it reads from multiple data sources (databases, files), applies configurable NLP processing steps, and outputs annotated JSON directly to Elasticsearch, files, or databases. The architecture uses Docker Compose for deployment and supports distributed processing with worker nodes for handling large-scale EHR datasets. Designed specifically for healthcare NLP workflows, it handles both structured and unstructured data including PDFs and clinical notes in resource-constrained environments.
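To make the idea of remote partitioning more concrete, here is a minimal sketch of how a coordinating node might split a range of record IDs into contiguous chunks, one per worker. It is not taken from the CogStack-Pipeline code base: the names `Partition`, `partition_id_range`, and `grid_size` are hypothetical, and the sketch assumes the records are addressed by an integer primary key.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    """An inclusive range of primary-key values handed to one worker (hypothetical type)."""
    name: str
    min_id: int
    max_id: int


def partition_id_range(min_id: int, max_id: int, grid_size: int) -> List[Partition]:
    """Split [min_id, max_id] into at most grid_size contiguous, non-overlapping ranges."""
    if grid_size < 1:
        raise ValueError("grid_size must be at least 1")
    if max_id < min_id:
        return []  # nothing to process

    total = max_id - min_id + 1
    # Ceiling division, so the ranges together cover every ID.
    step = -(-total // grid_size)

    partitions: List[Partition] = []
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        partitions.append(Partition(f"partition{len(partitions)}", start, end))
        start = end + 1
    return partitions


if __name__ == "__main__":
    # Each range would be sent to a worker node, which reads and processes only its own rows.
    for part in partition_id_range(1, 10, 3):
        print(part)
```

In a setup like this, a failed worker only requires its own range to be retried, which is one way a partitioned batch job can stay fault tolerant.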
No commits in the last 6 months.
Stars: 48
Forks: 13
Language: Java
License: —
Category:
Last pushed: Jan 13, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/CogStack/CogStack-Pipeline"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
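The same request can be made from a script. The sketch below uses only the Python standard library and mirrors the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and it is an assumption that other repositories follow the same `owner/repo` URL pattern and that the response body is UTF-8 text; the response format is not documented here, so the body is returned undecoded beyond that.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything else is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository, percent-encoding each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Perform the GET request and return the raw response body as text."""
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Counts against the keyless daily request limit mentioned above.
    print(fetch_quality("CogStack", "CogStack-Pipeline"))
```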
Higher-rated alternatives
docarray/docarray
Represent, send, store and search multimodal data
primeqa/primeqa
The prime repository for state-of-the-art Multilingual Question Answering research and development.
algoprog/Quin
An easy to use framework for large-scale fact-checking and question answering
danielfrees/scrapemed
ScrapeMed: Data scraping for PubMed Central.
ekatraone/Mobius-v1
Ekatra QnA is a student-focused intelligent search engine that enables them to find answers...