babysor/MockingBird
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
Uses a modular three-stage architecture: the speaker encoder and neural vocoder are pretrained, and only the Mandarin-optimized synthesizer is trained, which reduces computational overhead. Runs as either a PyQt5 desktop toolbox or a web server, with inference on GPU (CUDA) or CPU across Windows, Linux, and M1 Mac (via Rosetta emulation). Extensively tested on Chinese speech datasets (aidatatang_200zh, aishell3, magicdata) with PyTorch 1.9.0+. Users can train custom synthesizers or use community pretrained models.
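The three-stage data flow described above can be sketched as follows. This is a minimal illustration, not MockingBird's actual API: the class `VoiceCloningPipeline`, its fields, and the three `toy_*` functions are hypothetical stand-ins that only mimic the shape of the data passed between stages (reference audio to speaker embedding, text plus embedding to mel-like frames, frames to waveform).

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type aliases; real systems would use tensors/arrays.
Embedding = List[float]
MelFrames = List[List[float]]
Waveform = List[float]


@dataclass
class VoiceCloningPipeline:
    """Hypothetical wrapper showing how the three stages compose."""
    encode_speaker: Callable[[Waveform], Embedding]    # stage 1: pretrained
    synthesize: Callable[[str, Embedding], MelFrames]  # stage 2: the trained stage
    vocode: Callable[[MelFrames], Waveform]            # stage 3: pretrained

    def clone(self, reference_audio: Waveform, text: str) -> Waveform:
        embedding = self.encode_speaker(reference_audio)
        mel = self.synthesize(text, embedding)
        return self.vocode(mel)


# Toy stand-ins so the sketch runs without any model weights.
def toy_encoder(audio: Waveform) -> Embedding:
    # Summarize the reference clip as two numbers (mean and range).
    return [sum(audio) / len(audio), max(audio) - min(audio)]


def toy_synthesizer(text: str, embedding: Embedding) -> MelFrames:
    # One "frame" per character, conditioned on the speaker embedding.
    return [[float(ord(ch))] + embedding for ch in text]


def toy_vocoder(mel: MelFrames) -> Waveform:
    # Collapse each frame to a single output sample.
    return [sum(frame) / len(frame) for frame in mel]


pipeline = VoiceCloningPipeline(toy_encoder, toy_synthesizer, toy_vocoder)
audio_out = pipeline.clone([0.0, 0.5, -0.5, 0.25], "你好")
```

Because each stage is just a callable behind a fixed interface, the encoder and vocoder can stay fixed while a different synthesizer is swapped in, which is the property the description attributes to the modular design.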
Stars: 36,874
Forks: 5,236
Language: Python
License: —
Category:
Last pushed: Mar 03, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/babysor/MockingBird"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
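The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path follows a category/owner/repo layout (inferred from the single example URL above) and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    # Assumed layout: <base>/<category>/<owner>/<repo>, inferred from the example URL.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    # Assumption: the response body is JSON; the schema is not documented here.
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = quality_url("voice-ai", "babysor", "MockingBird")
```

Calling `fetch_quality("voice-ai", "babysor", "MockingBird")` performs the network request and counts against the daily request limit stated above.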
Related tools
pnnbao97/VieNeu-TTS
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio...
r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Softcatala/open-dubbing
Open dubbing is an AI dubbing system which uses machine learning models to automatically...
Amey-Thakur/DEEPFAKE-AUDIO
🎙️ Deepfake Audio – A neural voice cloning studio powered by SV2TTS technology.