Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

/ 100

Emerging

MARS5 uses a two-stage pipeline: an autoregressive (AR) model followed by a non-autoregressive (NAR) model that refines the output with a multinomial DDPM, which gives high-fidelity prosody control. Speaker cloning requires only 5 seconds of reference audio. Prosody can be steered through punctuation and capitalization in the transcript, and an optional "deep clone" mode uses the reference transcript for higher quality. The model is distributed via torch.hub and HuggingFace, with Docker support, and exposes inference settings for temperature, top-k sampling, and frequency penalty.
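To make the three inference settings concrete, here is a minimal, generic sketch of how temperature, top-k sampling, and a frequency penalty typically reshape a next-token distribution. It is not MARS5's implementation: the function names `adjust_logits` and `sample_token`, the order in which the steps are applied, and the example values are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def adjust_logits(logits, history, temperature=0.7, top_k=3, freq_penalty=3.0):
    """Turn raw logits into sampling probabilities (hypothetical helper).

    Steps (order is an assumption, not taken from MARS5):
      1. frequency penalty: subtract freq_penalty * (times token already emitted)
      2. temperature: divide logits by temperature (lower = more deterministic)
      3. top-k: keep only the k highest-scoring tokens, then softmax over them
    """
    counts = Counter(history)
    penalized = [l - freq_penalty * counts[i] for i, l in enumerate(logits)]
    scaled = [l / temperature for l in penalized]

    k = min(top_k, len(scaled))
    keep = set(sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k])

    # Numerically stable softmax restricted to the kept tokens.
    peak = max(scaled[i] for i in keep)
    exps = [math.exp(s - peak) if i in keep else 0.0 for i, s in enumerate(scaled)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, -1.0]   # made-up scores for a 4-token vocabulary
    history = [0, 0]                  # token 0 was already emitted twice
    probs = adjust_logits(logits, history, temperature=0.7, top_k=2, freq_penalty=3.0)
    print(probs)
    print(sample_token(probs, random.Random(0)))
```

In this toy example, token 0 has the highest raw logit but is pushed out of the top-k set because it was already emitted twice, which is the kind of repetition a frequency penalty is meant to discourage.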

2,814 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25


Stars 2,814

Forks 246

Language Jupyter Notebook

License AGPL-3.0

Featured in

Choosing a Voice AI Library in 2026: What's Actually Worth Building On

Higher-rated alternatives

OpenBMB/VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

myshell-ai/OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model.

codename0og/codename-rvc-fork-4

Codename's rvc fork version 4, based on Applio.

JackismyShephard/ultimate-rvc

An app for creating audio-based content such as song covers and speech using Retrieval-based...
