BoltzmannEntropy/xtts2-ui
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
Built on Coqui's XTTS-v2 multilingual model, this project provides a Streamlit web UI and a terminal interface for voice cloning across 16 languages, with integrated recording and file upload. It supports GPU acceleration via PyTorch CUDA and downloads the pretrained models automatically on first run. Cloning requires only a 10-second, 24 kHz WAV reference sample to generate speech in the target voice and language.
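As a rough illustration of the workflow described above, the sketch below checks that a reference clip matches the stated 10-second, 24 kHz WAV format and then hands it to XTTS-v2 through the Coqui `TTS` Python package. This is not taken from the project's own code: the helper names (`inspect_reference_wav`, `is_usable_reference`, `clone_voice`), the strict thresholds, and the default output path are assumptions made for the example, and the project may load the model differently.

```python
import wave

EXPECTED_RATE_HZ = 24_000  # sample rate named in the description
MIN_SECONDS = 10.0         # reference length named in the description


def inspect_reference_wav(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return rate, duration


def is_usable_reference(path):
    """Hypothetical pre-flight check: 24 kHz and at least 10 seconds long."""
    rate, duration = inspect_reference_wav(path)
    return rate == EXPECTED_RATE_HZ and duration >= MIN_SECONDS


def clone_voice(text, speaker_wav, language="en", out_path="output.wav"):
    """Synthesize `text` in the reference voice with the Coqui TTS package.

    Assumption: the model is loaded by its public Coqui model name; the
    project itself may wire this up differently.
    """
    # Imported lazily so the WAV check above works without torch/TTS installed.
    import torch
    from TTS.api import TTS

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
    tts.tts_to_file(text=text, speaker_wav=speaker_wav,
                    language=language, file_path=out_path)
    return out_path
```

A caller might run `is_usable_reference("reference.wav")` first and only call `clone_voice(...)` when it returns `True`; the first call to `clone_voice` triggers the model download.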
391 stars. No commits in the last 6 months.
Stars: 391
Forks: 67
Language: Python
License: MIT
Category
Last pushed: Dec 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/BoltzmannEntropy/xtts2-ui"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
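The same request can be made from Python with only the standard library. The sketch below is a minimal example under two assumptions not confirmed by the page: that the URL path follows a `category/owner/repo` pattern (inferred from the single URL shown), and that the endpoint returns a JSON body. The function names `quality_url` and `fetch_quality` are made up for illustration.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from one example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """GET the quality record for a repository.

    Assumption: the response body is JSON. No API key is sent, so the
    unauthenticated daily request limit applies.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For this repository the call would be `fetch_quality("voice-ai", "BoltzmannEntropy", "xtts2-ui")`; the fields of the returned record are not documented here, so inspect the result before relying on any key.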
Higher-rated alternatives
herimor/voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control
EveryVoiceTTS/EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
kadirnar/VoiceHub
VoiceHub: A Unified Inference Interface for TTS Models
NeonGeckoCom/neon-tts-plugin-coqui
Coqui AI TTS plugin
Atm4x/tts-with-rvc
TTS with RVC-module to generate .wav audios