CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Established

Implements the three-stage SV2TTS framework, combining a GE2E speaker encoder, a Tacotron synthesizer, and a WaveRNN vocoder to generate speech in real time from speaker embeddings. Provides both GUI and CLI interfaces that support CPU and GPU inference; pretrained models are downloaded automatically from Hugging Face. Although noted as an older reference implementation, it remains a functional open-source alternative to contemporary commercial voice cloning services.
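The three-stage data flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's API: `clone_voice`, `toy_encode`, `toy_synthesize`, and `toy_vocode` are hypothetical names, and the toy stages only mimic the shape of the data handed from a speaker encoder to a synthesizer to a vocoder.

```python
from typing import Callable, List, Sequence

Embedding = List[float]        # fixed-size speaker vector
MelFrames = List[List[float]]  # one feature vector per time step
Waveform = List[float]         # audio samples


def clone_voice(
    reference_audio: Sequence[float],
    text: str,
    encode: Callable[[Sequence[float]], Embedding],
    synthesize: Callable[[str, Embedding], MelFrames],
    vocode: Callable[[MelFrames], Waveform],
) -> Waveform:
    """Chain the three stages: audio -> embedding -> spectrogram -> audio."""
    embedding = encode(reference_audio)   # stage 1: who is speaking
    frames = synthesize(text, embedding)  # stage 2: what is said, in that voice
    return vocode(frames)                 # stage 3: frames back to samples


# Hypothetical stand-ins for the real models (illustration only).
def toy_encode(audio: Sequence[float]) -> Embedding:
    # Always returns 4 values, regardless of how long the input is.
    mean = sum(audio) / len(audio) if audio else 0.0
    return [mean] * 4


def toy_synthesize(text: str, embedding: Embedding) -> MelFrames:
    # One frame per character, offset by the speaker embedding.
    return [[ord(ch) / 255.0 + e for e in embedding] for ch in text]


def toy_vocode(frames: MelFrames) -> Waveform:
    # Emit 2 samples per frame, each the mean of that frame.
    return [sum(f) / len(f) for f in frames for _ in range(2)]


wav = clone_voice([0.1, 0.2, 0.3], "hi", toy_encode, toy_synthesize, toy_vocode)
```

Passing the stages in as callables keeps the pipeline shape visible and lets each toy function be swapped for a real model wrapper without changing `clone_voice`.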

59,518 stars. Actively maintained with 1 commit in the last 30 days.

No Package

No Dependents

Maintenance 16 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25

Stars

59,518

Forks

9,422

Language

Python

License

—

Featured in

Choosing a Voice AI Library in 2026: What's Actually Worth Building On

Compare

Real-Time-Voice-Cloning and MockingBird

Real-Time-Voice-Cloning and Voice-Cloning-App

Related tools

pnnbao97/VieNeu-TTS

Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio...

r9y9/nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.

babysor/MockingBird

🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time

Softcatala/open-dubbing

Open dubbing is an AI dubbing system which uses machine learning models to automatically...

coqui-ai/STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never...
