kha-white/manga-ocr

Optical character recognition for Japanese text, with the main focus being Japanese manga

Established

Built on the Vision Encoder Decoder architecture from Transformers, the model recognizes multi-line text in a single forward pass, so an entire manga speech bubble can be processed without splitting it into lines. It is trained specifically to handle manga-specific challenges, including vertical and horizontal text, furigana annotations, text overlaid on images, and low-quality scans. The tool can also run in the background, monitoring the clipboard or a directory, which enables workflows with screenshot tools (ShareX, Flameshot) and dictionary lookup applications such as Yomitan.
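The directory-monitoring workflow described above can be illustrated with a minimal sketch. It is not the project's own implementation: the helper names (`scan_new_images`, `process_folder_once`, `load_manga_ocr`) and the suffix list are hypothetical, and the recognizer is passed in as a plain callable so the polling logic can be tried without downloading the model. The only manga-ocr call assumed here is the documented `MangaOcr()` object, which is callable with an image path.

```python
from pathlib import Path
from typing import Callable, Dict, List, Set

# Assumed set of screenshot formats; adjust to whatever your tool writes.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def scan_new_images(folder: Path, seen: Set[Path]) -> List[Path]:
    """Return image files in `folder` that are not in `seen`, oldest first.

    Every returned path is added to `seen`, so the next call skips it.
    """
    candidates = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES and p not in seen
    ]
    candidates.sort(key=lambda p: (p.stat().st_mtime, p.name))
    seen.update(candidates)
    return candidates


def process_folder_once(
    folder: Path, recognize: Callable[[str], str], seen: Set[Path]
) -> Dict[str, str]:
    """Run `recognize` on each new image and map file name to recognized text."""
    return {
        path.name: recognize(str(path))
        for path in scan_new_images(Path(folder), seen)
    }


def load_manga_ocr() -> Callable[[str], str]:
    """Build the real recognizer (hypothetical helper).

    Requires `pip install manga-ocr`; the model is downloaded on first use.
    """
    from manga_ocr import MangaOcr  # imported lazily so the sketch runs without it
    return MangaOcr()


# Example polling loop (commented out because it runs until interrupted):
#
# import time
# recognize = load_manga_ocr()
# seen: Set[Path] = set()
# while True:
#     for name, text in process_folder_once(Path("screenshots"), recognize, seen).items():
#         print(name, text)
#     time.sleep(0.5)
```

Keeping the recognizer as a parameter means the same loop works with a stub during testing and with the real model in use. Each image is passed whole, since the model reads multi-line bubbles in one pass.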

2,582 stars and 17,983 monthly downloads. No commits in the last 6 months. Available on PyPI.

Maintenance 2 / 25

Adoption 20 / 25

Maturity 25 / 25

Community 17 / 25

Stars

2,582

Forks

127

Language

Python

License

Apache-2.0

Related models

clusterzx/paperless-ai

An automated document analyzer for Paperless-ngx using OpenAI API, Ollama, Deepseek-r1, Azure...

bytefer/ollama-ocr

Implementing OCR with a local visual model run by ollama.

alephpi/Texo-web

The web application for Texo, a minimalist SOTA LaTeX OCR model which contains only 20M...

alephpi/Texo

A minimalist SOTA LaTeX OCR model with only 20M parameters, running in browser. Full training...

Dartvauder/NeuroSandboxWebUI

(Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on...
