Chatterbox-TTS-Server and Chatterbox-TTS-Extended
The Extended version is a fork that removes the original's input constraints, making the two projects alternatives rather than complements: choose the original for its interactive Web UI, API endpoints, and per-request generation, or the fork for unrestricted batch processing of text files when producing long-form content such as audiobooks.
About Chatterbox-TTS-Server
devnen/Chatterbox-TTS-Server
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.
Supports three distinct Chatterbox model variants—Original, Multilingual (23 languages), and Turbo (350M parameters with single-step audio diffusion)—all hot-swappable via UI dropdown without server restart. Built on FastAPI with intelligent text chunking for audiobook-scale processing, generation seeds for reproducible voices, and native paralinguistic tags (`[laugh]`, `[cough]`) in Turbo for expressive agent narratives. Includes portable Windows mode with embedded Python runtime for zero-dependency deployment.
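Since the server advertises OpenAI-compatible endpoints, a client can likely reuse the OpenAI speech-API request shape. The sketch below is a minimal, hypothetical client: the `/v1/audio/speech` path, the port, and the field names (`model`, `input`, `voice`) are assumed from the OpenAI API convention, not verified against this repo.

```python
# Hypothetical client for the server's OpenAI-compatible TTS endpoint.
# Path, port, and payload fields mirror OpenAI's /v1/audio/speech shape
# and are assumptions, not taken from this project's documentation.
import json
import urllib.request


def build_speech_request(base_url: str, text: str, voice: str = "default"):
    """Return (url, json_body) for an OpenAI-style speech request."""
    url = f"{base_url.rstrip('/')}/v1/audio/speech"
    body = json.dumps(
        {"model": "chatterbox", "input": text, "voice": voice}
    ).encode("utf-8")
    return url, body


if __name__ == "__main__":
    # Requires a running Chatterbox-TTS-Server instance on this port.
    url, body = build_speech_request("http://localhost:8004", "Hello from Chatterbox.")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        with open("speech.wav", "wb") as f:
            f.write(resp.read())
```

Separating request construction from the network call keeps the payload logic testable without a live server.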
About Chatterbox-TTS-Extended
petermg/Chatterbox-TTS-Extended
Modified version of Chatterbox that accepts text files as input and has no character restrictions. I use it to make audiobooks, especially for my kids.
Supports batch processing of multiple text files with advanced audio quality controls including RNNoise denoising, FFmpeg normalization, and Auto-Editor artifact removal. Built on a modular pipeline with Whisper/faster-whisper validation per chunk, configurable parallelism, and deterministic seed-based reproducibility for consistent regeneration. Includes voice conversion capabilities, persistent UI settings, and flexible text preprocessing (reference removal, sound word substitution, sentence batching).
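The sentence-batching step described above can be illustrated with a small sketch: group sentences into chunks under a character budget so each chunk fits a single TTS generation pass. This is an assumption-level illustration of the idea, not the fork's actual code; the splitting regex and the 300-character budget are invented for the example.

```python
# Illustrative sentence-aware chunking for long-form TTS input.
# The split pattern and character budget are assumptions for this
# sketch, not values taken from Chatterbox-TTS-Extended itself.
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Group whole sentences into chunks of at most max_chars each."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Chunking on sentence boundaries rather than raw character offsets avoids cutting words mid-utterance, which is why long-form pipelines like the ones both projects describe validate and regenerate per chunk.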