Chatterbox-TTS-Server and Chatterbox-TTS-Extended

Chatterbox-TTS-Extended is a modified version of Chatterbox that removes character restrictions and accepts text files as input, so the two projects are alternatives rather than complements. Choose Chatterbox-TTS-Server if you need an interactive Web UI and API endpoints, or Chatterbox-TTS-Extended if you need batch text-file processing for long-form content.

| | Chatterbox-TTS-Server | Chatterbox-TTS-Extended |
|---|---|---|
| Score | 70 (Verified) | 50 (Established) |
| Maintenance | 20/25 | 2/25 |
| Adoption | 10/25 | 10/25 |
| Maturity | 15/25 | 15/25 |
| Community | 25/25 | 23/25 |
| Stars | 1,101 | 534 |
| Forks | 267 | 94 |
| Downloads | - | - |
| Commits (30d) | 23 | 0 |
| Language | Python | Python |
| License | MIT | MIT |
| Flags | No Package, No Dependents | Stale 6m, No Package, No Dependents |

About Chatterbox-TTS-Server

devnen/Chatterbox-TTS-Server

Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.
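As a rough illustration of what "OpenAI compatible" usually means for a TTS server, the sketch below builds a request in the shape of OpenAI's speech API (`POST /v1/audio/speech` with `model`, `input`, and `voice` fields). The base URL, the model name, and the voice name are assumptions made for the example, as is the premise that the server mirrors this path; check the project's own API documentation before relying on any of them.

```python
import json
import urllib.request

# Assumption: host and port are placeholders, not the server's documented default.
BASE_URL = "http://localhost:8000"


def build_speech_request(text: str, voice: str, model: str = "chatterbox") -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI speech format.

    The model and voice values are hypothetical; use the ones your server lists.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(request: urllib.request.Request, path: str) -> None:
    """Send the request to a running server and write the audio bytes to disk."""
    with urllib.request.urlopen(request) as response:
        with open(path, "wb") as out:
            out.write(response.read())
```

Calling `save_speech(build_speech_request("Hello.", voice="narrator"), "hello.wav")` would only work against a running server that accepts this request shape.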

Supports three distinct Chatterbox model variants—Original, Multilingual (23 languages), and Turbo (350M parameters with single-step audio diffusion)—all hot-swappable via UI dropdown without server restart. Built on FastAPI with intelligent text chunking for audiobook-scale processing, generation seeds for reproducible voices, and native paralinguistic tags (`[laugh]`, `[cough]`) in Turbo for expressive agent narratives. Includes portable Windows mode with embedded Python runtime for zero-dependency deployment.
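The idea behind text chunking for long inputs can be sketched as follows: split the text at sentence boundaries, then pack sentences into chunks that stay under a size limit so that each chunk can be synthesized separately. This is a minimal illustration, not the server's actual algorithm; the function name `chunk_text` and the `max_chars` limit are hypothetical.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split by length.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate  # sentence still fits in the open chunk
        else:
            chunks.append(current)  # close the open chunk, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Breaking at sentence ends rather than at a fixed character count keeps each chunk a natural unit of speech, which tends to avoid audible cuts mid-phrase when the audio segments are joined.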

About Chatterbox-TTS-Extended

petermg/Chatterbox-TTS-Extended

Modified version of Chatterbox that accepts text files as input and has no character restrictions. I use it to make audiobooks, especially for my kids.

Supports batch processing of multiple text files with advanced audio quality controls including RNNoise denoising, FFmpeg normalization, and Auto-Editor artifact removal. Built on a modular pipeline with Whisper/faster-whisper validation per chunk, configurable parallelism, and deterministic seed-based reproducibility for consistent regeneration. Includes voice conversion capabilities, persistent UI settings, and flexible text preprocessing (reference removal, sound word substitution, sentence batching).
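Per-chunk validation with a speech recognizer can be pictured roughly like this: synthesize a chunk, transcribe the audio, compare the transcript with the source text, and retry with a different seed if the match is poor. The sketch below is an assumption-laden illustration, not the project's pipeline: `synthesize` and `transcribe` are hypothetical callables you would supply (for example, wrappers around the TTS model and a Whisper model), and the threshold and the `difflib` similarity measure are choices made for the example.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so the comparison ignores formatting."""
    return " ".join(re.sub(r"[^a-z0-9\s]", "", text.lower()).split())


def similarity(expected: str, transcript: str) -> float:
    """Return a 0..1 ratio of how closely the transcript matches the text."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(transcript))
    return matcher.ratio()


def generate_validated(text, synthesize, transcribe, threshold=0.85, max_attempts=3):
    """Retry synthesis until the transcript matches, keeping the best attempt.

    synthesize(text, seed=...) and transcribe(audio) are hypothetical hooks.
    """
    best_audio, best_score = None, -1.0
    for attempt in range(max_attempts):
        audio = synthesize(text, seed=attempt)  # a new seed for each retry
        score = similarity(text, transcribe(audio))
        if score > best_score:
            best_audio, best_score = audio, score
        if score >= threshold:
            break  # good enough, stop retrying
    return best_audio, best_score
```

Keeping the best-scoring attempt means a chunk is still produced when no attempt reaches the threshold, which suits unattended batch runs over long texts.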

Scores updated daily from GitHub, PyPI, and npm data.