Real-Time-Voice-Cloning and Voice-Cloning-App

These projects compete on the same core voice cloning task. Real-Time-Voice-Cloning is distinguished by its 5-second voice enrollment and far larger community adoption (59,518 stars versus 1,443), while Voice-Cloning-App provides a more accessible Python/PyTorch application interface.

Real-Time-Voice-Cloning
Score: 65 (Established)
Maintenance 16/25
Adoption 10/25
Maturity 16/25
Community 23/25
Stars: 59,518
Forks: 9,422
Downloads:
Commits (30d): 1
Language: Python
License:
Tags: No Package, No Dependents

Voice-Cloning-App
Score: 42 (Emerging)
Maintenance 0/25
Adoption 10/25
Maturity 9/25
Community 23/25
Stars: 1,443
Forks: 238
Downloads:
Commits (30d): 0
Language: Python
License: BSD-3-Clause
Tags: Stale 6m, No Package, No Dependents
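The headline scores appear to be the plain sum of the four sub-scores, each out of 25. That aggregation is an assumption inferred from the numbers shown here, not a documented formula, and the helper name `total_score` is hypothetical. A minimal sketch of the arithmetic:

```python
# Sub-scores copied from the comparison above; each is out of 25.
SUBSCORES = {
    "Real-Time-Voice-Cloning": {
        "Maintenance": 16, "Adoption": 10, "Maturity": 16, "Community": 23,
    },
    "Voice-Cloning-App": {
        "Maintenance": 0, "Adoption": 10, "Maturity": 9, "Community": 23,
    },
}


def total_score(subscores):
    """Assumed aggregation: a plain sum of the four sub-scores (max 100)."""
    return sum(subscores.values())


# Totals per project under the assumed sum rule.
TOTALS = {name: total_score(parts) for name, parts in SUBSCORES.items()}
```

Under this assumption the totals match the displayed scores of 65 and 42, and the whole gap comes from Maintenance (16 versus 0) and Maturity (16 versus 9).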

About Real-Time-Voice-Cloning

CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Implements the three-stage SV2TTS framework, which combines a GE2E speaker encoder, a Tacotron synthesizer, and a WaveRNN vocoder to generate speech in real time from speaker embeddings. Provides both GUI and CLI interfaces supporting CPU and GPU inference, with pretrained models downloaded automatically from Hugging Face. Although noted as an older reference implementation, it remains a functional open-source alternative to contemporary commercial voice cloning services.
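The data flow between the three stages can be sketched as follows. This is only an illustration of how the stages chain together: `clone_voice` and the `toy_*` functions are hypothetical placeholders, not the repository's API, and the toy stages do no real signal processing.

```python
from typing import Callable, List

Embedding = List[float]             # fixed-size speaker vector
MelSpectrogram = List[List[float]]  # frames x mel bins
Waveform = List[float]              # audio samples


def clone_voice(
    reference_audio: Waveform,
    text: str,
    encode: Callable[[Waveform], Embedding],
    synthesize: Callable[[str, Embedding], MelSpectrogram],
    vocode: Callable[[MelSpectrogram], Waveform],
) -> Waveform:
    """Chain the three stages: encoder -> synthesizer -> vocoder."""
    embedding = encode(reference_audio)  # stage 1: who is speaking
    mel = synthesize(text, embedding)    # stage 2: what is said, in that voice
    return vocode(mel)                   # stage 3: spectrogram to audio


# Toy stand-ins (assumptions for illustration only).
def toy_encode(audio: Waveform) -> Embedding:
    # Collapse the reference audio to a one-element "embedding" (its mean).
    return [sum(audio) / len(audio)]


def toy_synthesize(text: str, embedding: Embedding) -> MelSpectrogram:
    # Emit one frame per character, each frame a copy of the embedding.
    return [list(embedding) for _ in text]


def toy_vocode(mel: MelSpectrogram) -> Waveform:
    # Turn each frame into a single sample by summing its bins.
    return [sum(frame) for frame in mel]


samples = clone_voice([0.0, 1.0], "hi", toy_encode, toy_synthesize, toy_vocode)
```

The point of the design is that the speaker embedding is computed once from the reference audio and then conditions the synthesizer, so new text can be spoken without retraining on the target voice.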

About Voice-Cloning-App

voice-cloning-app/Voice-Cloning-App

A Python/Pytorch app for easily synthesising human voices

Supports multilingual voice cloning through automated dataset generation from subtitles and audiobooks, with local or remote training across multiple GPUs. Built on a reworked Tacotron2 architecture paired with HiFi-GAN vocoding for high-quality synthesis. Integrates Mozilla's DSAlign for forced alignment and Silero for voice activity detection, and offers remote training via Google Colab notebooks.
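To make the subtitle-based dataset generation concrete, one plausible first step is turning subtitle timings into (start, end, text) entries that can later be used to cut audio clips. The sketch below parses the common SRT subtitle format with the standard library only. It is not the app's own code, and `Clip` and `parse_srt` are hypothetical names.

```python
import re
from typing import List, NamedTuple


class Clip(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text: str) -> List[Clip]:
    """Return one Clip per subtitle cue that has both timing and text."""
    clips = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, line in enumerate(lines):
            match = _TIMING.match(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the transcript.
                text = " ".join(lines[i + 1:])
                if text:
                    clips.append(Clip(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return clips


EXAMPLE_SRT = (
    "1\n00:00:01,000 --> 00:00:03,500\nHello there.\n\n"
    "2\n00:00:04,000 --> 00:00:06,250\nSecond line\ncontinues here.\n"
)
EXAMPLE_CLIPS = parse_srt(EXAMPLE_SRT)
```

Each resulting entry pairs a time range with its transcript, which is the shape of data a text-to-speech training set needs once the matching audio segment is extracted.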

Scores updated daily from GitHub, PyPI, and npm data.