CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Score: 65 / 100 (Established)

Implements the three-stage SV2TTS framework: a GE2E speaker encoder, a Tacotron synthesizer, and a WaveRNN vocoder, which together generate speech in real time from a speaker embedding. Provides both GUI and CLI interfaces, supports CPU and GPU inference, and downloads pretrained models automatically from Hugging Face. Although it is noted as an older reference implementation, it remains a functional open-source alternative to contemporary commercial voice cloning services.
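The data flow between the three stages can be sketched as follows. This is only an illustrative sketch, not the repository's code: every function name, the constants `EMBED_DIM`, `N_MELS`, and `HOP`, and the placeholder arithmetic inside each stage are assumptions chosen to show the shape of the data passed from one stage to the next.

```python
from typing import List

# Illustrative sizes only; these are assumptions, not values taken from this page.
EMBED_DIM = 256   # length of the speaker embedding vector
N_MELS = 80       # mel channels per spectrogram frame
HOP = 200         # waveform samples produced per spectrogram frame


def encode_speaker(reference_wav: List[float]) -> List[float]:
    """Stage 1 placeholder: reference audio -> fixed-size speaker embedding."""
    mean = sum(reference_wav) / max(len(reference_wav), 1)
    return [mean] * EMBED_DIM


def synthesize(text: str, embedding: List[float]) -> List[List[float]]:
    """Stage 2 placeholder: text + embedding -> mel spectrogram (frames x N_MELS)."""
    frames = 5 * len(text)  # arbitrary stand-in: 5 frames per character
    return [[embedding[0]] * N_MELS for _ in range(frames)]


def vocode(mel: List[List[float]]) -> List[float]:
    """Stage 3 placeholder: mel spectrogram -> waveform samples."""
    return [frame[0] for frame in mel for _ in range(HOP)]


def clone_voice(reference_wav: List[float], text: str) -> List[float]:
    """Chain the three stages: encoder -> synthesizer -> vocoder."""
    embedding = encode_speaker(reference_wav)
    mel = synthesize(text, embedding)
    return vocode(mel)
```

The point of the sketch is that the speaker embedding is computed once from a short reference clip and then conditions the synthesizer for any text, which is why new sentences can be generated without retraining.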

59,518 stars. Actively maintained with 1 commit in the last 30 days.

No Package · No Dependents
Maintenance 16 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 23 / 25


Stars: 59,518
Forks: 9,422
Language: Python
License:
Last pushed: Mar 09, 2026
Commits (30d): 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/CorentinJ/Real-Time-Voice-Cloning"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
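The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL layout is inferred from the single curl example above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; the segment layout after it
# (category/owner/repo) is inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response body (assumed to be JSON)."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("voice-ai", "CorentinJ", "Real-Time-Voice-Cloning")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result instead of calling the endpoint repeatedly.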