fishaudio/fish-speech

SOTA Open Source TTS

/ 100

Established

Implements a Dual-Autoregressive architecture that pairs a 4B-parameter slow decoder with a 400M-parameter fast decoder for semantic and acoustic codebook generation, trained on 10M+ hours of audio across 80+ languages. Supports sub-word prosody and emotion control via inline natural-language tags (e.g., `[whisper]`, `[excited]`) as well as multi-speaker conversations, with reinforcement learning alignment for instruction adherence and naturalness.
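To make the inline-tag idea concrete, here is a minimal sketch of how tagged input text could be split into styled spans before synthesis. It is not fish-speech code: the `split_inline_tags` helper and its regular expression are hypothetical, and the rule that a tag applies to the text following it until the next tag is an assumption made for illustration only.

```python
import re

# Hypothetical helper for illustration; not part of the fish-speech API.
# Matches lowercase bracketed tags such as [whisper] or [excited].
TAG_PATTERN = re.compile(r"\[([a-z][a-z _-]*)\]")


def split_inline_tags(text):
    """Return (tag, span) pairs; tag is None for text before the first tag.

    Assumption: a tag styles the text that follows it until the next tag.
    """
    segments = []
    current_tag = None
    position = 0
    for match in TAG_PATTERN.finditer(text):
        span = text[position:match.start()].strip()
        if span:
            segments.append((current_tag, span))
        current_tag = match.group(1)
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append((current_tag, tail))
    return segments


example = "Hello there. [whisper] Keep this quiet. [excited] We won!"
for tag, span in split_inline_tags(example):
    print(tag, "->", span)
```

Each resulting pair could then be handed to whatever synthesis call the project actually exposes; consult the repository's own documentation for the real tag vocabulary and tag semantics.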

26,613 stars. Actively maintained with 26 commits in the last 30 days.

No Package · No Dependents

Maintenance 23 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25


Stars: 26,613

Forks: 2,237

Language: Python

License: —

Related tools

Blaizzy/mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's...

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server...

mlalma/kokoro-ios

Kokoro TTS for iOS and macOSX

sidharthrajaram/StyleTTS2

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

mlalma/KokoroTestApp

Test application for Kokoro TTS model
