stochastic-sisyphus/text-feature-span-extractor
Deterministic invoice extraction using native PDF text layers. No OCR nonsense, no brittle rules that break at scale, no vendor lock-in paying exorbitant prices for creative interpretations of financial documents. This is my battle, I pick this hill!
Stars: 1
Forks: —
Language: Python
License: —
Category:
Last pushed: Mar 23, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/stochastic-sisyphus/text-feature-span-extractor"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
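The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the path layout is `/quality/<category>/<owner>/<repo>` (inferred from the single URL shown, where `nlp` appears to be the category) and that the response body is JSON; neither is confirmed by this page. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str, category: str = "nlp") -> str:
    """Build the endpoint URL.

    Assumption: the path is /<category>/<owner>/<repo>, as inferred
    from the one example URL; "nlp" is taken to be the category segment.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(owner: str, repo: str, category: str = "nlp",
                  timeout: float = 10.0) -> dict:
    """Send a GET request and decode the body.

    Assumption: the service returns a JSON document; the page does not
    describe the response schema, so no fields are accessed here.
    """
    url = quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL needs no network access:
print(quality_url("stochastic-sisyphus", "text-feature-span-extractor"))
```

Calling `fetch_quality("stochastic-sisyphus", "text-feature-span-extractor")` performs the actual request and counts against the daily limit. How an API key is supplied is not stated on this page, so the sketch covers only the keyless tier.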
Higher-rated alternatives
pd3f/pd3f
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
kiku-jw/DocStripper
🧹 DocStripper is a lightweight CLI utility that automatically cleans text documents
climate-nlp/reportparse
ReportParse is a unified NLP analyzer for corporate sustainability reports
jwc524/clippy
A smart PDF reader that extracts text and generates headings and summaries using NLP methods.
TheAkshatGupta/Intelligent-Document-Parsing-FinTech
NLP-based system to extract structured information from financial documents