kha-white/manga-ocr
Optical character recognition for Japanese text, with the main focus being Japanese manga
Built on the Vision Encoder Decoder architecture from Transformers, it recognizes multi-line text in a single forward pass, so an entire manga speech bubble can be processed without splitting it into lines. The model is trained to handle manga-specific challenges, including vertical and horizontal text, furigana annotations, text overlaid on images, and low-quality scans. It can also monitor the clipboard or a directory in the background, which supports workflows that combine screenshot tools (ShareX, Flameshot) with dictionary lookup applications such as Yomitan.
2,582 stars and 17,983 monthly downloads. No commits in the last 6 months. Available on PyPI.
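The directory-monitoring workflow described above can be sketched as a small polling loop. This is a minimal illustration, not the project's own implementation: the helper names (`find_new_images`, `process_new_images`, `watch`), the suffix list, and the polling interval are all hypothetical, and the recognizer is passed in as a plain callable so the loop can be tried without downloading the model. With the package installed, a `MangaOcr()` instance from `manga_ocr` can serve as that callable, since it takes an image path and returns the recognized text.

```python
import time
from pathlib import Path
from typing import Callable, List, Set, Tuple

# Hypothetical list of suffixes treated as screenshots.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def find_new_images(directory: Path, seen: Set[Path]) -> List[Path]:
    """Return image files in `directory` that are not in `seen`, oldest first."""
    candidates = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES and p not in seen
    ]
    return sorted(candidates, key=lambda p: (p.stat().st_mtime, p.name))


def process_new_images(
    directory: Path,
    seen: Set[Path],
    recognize: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Run `recognize` on each unseen image and mark it as seen."""
    results = []
    for path in find_new_images(directory, seen):
        seen.add(path)
        results.append((path.name, recognize(str(path))))
    return results


def watch(directory: Path, recognize: Callable[[str], str], interval: float = 0.5) -> None:
    """Poll `directory` forever, printing text for images that appear after startup."""
    seen = set(directory.iterdir())  # ignore files already present at startup
    while True:
        for name, text in process_new_images(directory, seen, recognize):
            print(f"{name}: {text}")
        time.sleep(interval)


# Real usage (assumes `pip install manga-ocr`; not executed here):
#   from manga_ocr import MangaOcr
#   watch(Path("screenshots"), MangaOcr())
```

Pointing a screenshot tool's output folder at the watched directory gives roughly the workflow the description mentions; copying the printed text to the clipboard for a dictionary tool is left out of the sketch.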
Stars: 2,582
Forks: 127
Language: Python
License: Apache-2.0
Category:
Last pushed: Jun 14, 2025
Monthly downloads: 17,983
Commits (30d): 0
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kha-white/manga-ocr"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
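The same request can be made from Python with the standard library. The sketch below rests on two assumptions that the page does not confirm: that the two path parts after `/quality/` are a category and an `owner/name` repository, and that the endpoint returns JSON. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL; `repo` is "owner/name" and keeps its slash."""
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and parse the body, assuming it is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request; not executed here):
#   data = fetch_quality("transformers", "kha-white/manga-ocr")
```

With the keyless limit of 100 requests/day, caching the response locally is advisable for anything that polls.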
Related models
clusterzx/paperless-ai
An automated document analyzer for Paperless-ngx using OpenAI API, Ollama, Deepseek-r1, Azure...
bytefer/ollama-ocr
Implementing OCR with a local visual model run by ollama.
alephpi/Texo-web
The web application for Texo, a minimalist SOTA LaTeX OCR model which contains only 20M...
alephpi/Texo
A minimalist SOTA LaTeX OCR model with only 20M parameters, running in browser. Full training...
Dartvauder/NeuroSandboxWebUI
(Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on...