Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

45 / 100 (Emerging)

Employs a two-stage pipeline, an autoregressive (AR) model followed by a non-autoregressive (NAR) model, with multinomial DDPM refinement for high-fidelity prosody control. Speaker cloning requires only 5 seconds of reference audio. Prosody can be steered at a fine-grained level through punctuation and capitalization in the transcript, and an optional "deep clone" mode uses the transcript of the reference audio for enhanced quality. Distributed via torch.hub and HuggingFace, with Docker support. Inference can be tuned through temperature, top-k sampling, and frequency penalty settings.
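The three inference settings named above are standard sampling controls. As a rough illustration of how such controls typically reshape a model's output distribution, here is a minimal, generic sketch in plain Python. It is not taken from the MARS5 code base: the function names `adjust_logits` and `sample_token`, the default values, and the exact form of the frequency penalty (subtracting the penalty times the number of prior occurrences) are assumptions made for illustration, and MARS5's own implementation may differ.

```python
import math
import random
from collections import Counter


def adjust_logits(logits, history, temperature=0.7, top_k=100, freq_penalty=3.0):
    """Turn raw logits into sampling probabilities (hypothetical helper).

    Applies, in order: a frequency penalty, temperature scaling, and top-k
    filtering. `temperature` must be greater than zero.
    """
    counts = Counter(history)
    # Frequency penalty: lower each token's logit in proportion to how often
    # that token already appears in the history.
    penalized = [l - freq_penalty * counts.get(i, 0) for i, l in enumerate(logits)]
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [l / temperature for l in penalized]
    # Top-k: keep only the k highest-scoring tokens.
    k = min(top_k, len(scaled))
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = set(ranked[:k])
    # Softmax over the kept tokens, shifted by the max for numerical stability.
    peak = max(scaled[i] for i in keep)
    exps = [math.exp(s - peak) if i in keep else 0.0 for i, s in enumerate(scaled)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, history, rng=None, **kwargs):
    """Draw one token index from the adjusted distribution (hypothetical helper)."""
    rng = rng or random.Random()
    probs = adjust_logits(logits, history, **kwargs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In this sketch, a token that has already been emitted several times is pushed down before the top-k cut, so a high frequency penalty can remove a repeated token from the candidate set entirely.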

2,814 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25


Stars: 2,814
Forks: 246
Language: Jupyter Notebook
License: AGPL-3.0
Last pushed: Aug 01, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Camb-ai/MARS5-TTS"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
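For scripted access, the same request can be made from Python's standard library. This is a minimal sketch under stated assumptions: the path layout (category, then owner, then repository) is inferred from the single URL shown above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is returned as raw text because the response format is not documented here.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (hypothetical helper; path layout is assumed)."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and return the response body as text.

    The body is not parsed, since its format is not stated in the source.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "Camb-ai", "MARS5-TTS")` issues the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit.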