kokoro-tts and kokorodoki
kokoro-tts is a mature, feature-rich CLI implementation of the Kokoro TTS model with broad format support, while kokorodoki is a lightweight real-time TTS application built on the same underlying Kokoro model. The two are complementary tools serving different use cases: kokoro-tts for batch processing, kokorodoki for interactive deployment.
About kokoro-tts
nazdridoy/kokoro-tts
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
It builds on ONNX Runtime for efficient inference and processes text in chunks, with configurable speech speed and output format (WAV or MP3). The tool accepts piped stdin input for workflow composition and extracts chapter structure from EPUB and PDF files, so long-form content can be batch-processed in an organized way, optionally split into one audio file per chapter.
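To illustrate what chunk-based processing means in general, here is a minimal sketch of sentence-aligned chunking. It is not taken from kokoro-tts: the function name `chunk_text`, the `max_chars` parameter, and the splitting rule are all assumptions made for illustration, and the tool's actual chunking logic may differ.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Group sentences into chunks of at most max_chars where possible.

    Hypothetical helper, not the kokoro-tts implementation. A single
    sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("One. Two. Three.", max_chars=9):
        print(chunk)
```

Each chunk could then be synthesized separately and the resulting audio concatenated, which keeps per-call input short for long documents.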
About kokorodoki
eel-brah/kokorodoki
Natural-sounding Text-to-Speech App that fits anywhere. Fast, Real-Time and flexible.
Built on the lightweight Kokoro-82M model, it supports 8 languages and 54+ voices, with optional CUDA GPU acceleration for low-latency synthesis. Its four operational modes (Console, GUI, Daemon, and CLI) enable diverse integration patterns, including clipboard monitoring in Daemon mode and SRT subtitle synchronization for timed audio generation.
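As a rough illustration of what SRT-synchronized generation has to start from, the sketch below parses SRT text into timed cues. It is not kokorodoki's code: the `Cue` class and `parse_srt` function are hypothetical names, and the parser assumes plain timing lines with no trailing position data.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


# Matches HH:MM:SS,mmm (a "." separator is also tolerated).
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized SRT timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(srt: str) -> list[Cue]:
    """Parse SRT text into cues (hypothetical helper for illustration)."""
    cues: list[Cue] = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(start), _seconds(end), text))
                break
    return cues
```

With cues in hand, a synthesizer could generate audio for each `text` and place it at `start`, padding with silence between cues; how kokorodoki handles audio that runs longer than its cue window is not stated in the description above.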