jonaswinkler/paperless-ng

A supercharged version of paperless: scan, index and archive all your physical documents

Archived


Emerging

Performs automatic OCR and full-text indexing on documents (PDF, images, and Office formats via Apache Tika), with machine-learning-based auto-tagging of correspondents and document types. Provides a modern single-page web frontend with relevance-ranked full-text search, email ingestion with filtering rules, and parallel document processing optimized for multi-core systems. Stores documents as plain files on disk with configurable naming schemes, accepts scans from network scanners via FTP or from mobile apps, and ships as a Docker Compose deployment.
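To make the "configurable naming schemes" idea concrete, here is a minimal sketch of how a filename template could be filled in from document metadata. It is an illustration only, not paperless-ng's implementation: the `Document` class, the `render_path` function, and the exact placeholder set are hypothetical, and the template string is modeled loosely on the kind of value the project's filename format setting is understood to accept.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass
class Document:
    # Hypothetical metadata record, not paperless-ng's actual model.
    title: str
    correspondent: str
    created: date


def render_path(doc: Document, template: str) -> PurePosixPath:
    """Fill a naming template with document metadata and return a relative path."""
    fields = {
        "title": doc.title,
        "correspondent": doc.correspondent,
        "created_year": f"{doc.created.year:04d}",
        "created_month": f"{doc.created.month:02d}",
    }
    # Replace path separators inside values so metadata cannot create extra directories.
    safe = {key: value.replace("/", "-") for key, value in fields.items()}
    return PurePosixPath(template.format(**safe) + ".pdf")


doc = Document(title="Electricity bill", correspondent="ACME Energy", created=date(2021, 3, 14))
path = render_path(doc, "{created_year}/{correspondent}/{title}")
print(path)
```

Because the files end up in ordinary directories named after their metadata, they remain browsable without the application, which is the practical benefit of storing documents plainly on disk.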

5,416 stars. No commits in the last 6 months.

Archived, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 18 / 25


Stars 5,416

Forks 349

Language Python

License GPL-3.0

Compare

paperless-ng and paperless-ngx

Higher-rated alternatives

paperless-ngx/paperless-ngx

A community-supported supercharged document management system: scan, index and archive all your documents

GoogleCloudPlatform/document-ai-samples

Sample applications and demos for Document AI, the end-to-end document processing platform on...

aphp/edspdf

EDS-PDF is a generic, pure-Python framework for text extraction from PDF documents. It provides...

aws-solutions/document-understanding-solution

Example of integrating & using Amazon Textract, Amazon Comprehend, Amazon Comprehend Medical,...

naiveHobo/InvoiceNet

Deep neural network to extract intelligent information from invoice documents.
