amit-timalsina/document_classification

All-in-one package for document (image, PDF) classification. Unified interface for Google OCR and Tesseract. Train, evaluate, and run inference using fastText, small language models (NER), small vision-language models (LayoutLM), and LLMs.

/ 100

Experimental

No commits in the last 6 months.

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 9 / 25

Community 0 / 25

Language

Python

License

MIT

Category

document-data-extraction

Last pushed

Dec 13, 2024

Document Data Extraction · 74 tools

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/amit-timalsina/document_classification"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
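
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which the page does not confirm; the helper names `build_quality_url` and `fetch_quality` are hypothetical, and no API key is sent because the page does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay inside the segment.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("amit-timalsina", "document_classification")` issues the same GET request as the curl command. Keyless use is limited to 100 requests/day, so caching the result locally is a reasonable choice when polling many repositories.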

Higher-rated alternatives

NanoNets/docstrange

Extract and convert data from any document, images, pdfs, word doc, ppt or URL into multiple...

hashangit/Extract2MD

Extract2MD is a powerful and versatile AI-enabled client-side JavaScript library for extracting...

Dicklesworthstone/llm_aided_ocr

Enhances Tesseract OCR output using LLMs (local or API) for error correction, smart chunking,...

th1nhhdk/local_ai_ocr

A local, offline (after initial setup), portable OCR software that can process images and PDF...

emcf/thepipe

Get clean data from tricky documents, powered by vision-language models ⚡
