BoltzmannEntropy/xtts2-ui

A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech

/ 100

Emerging

Built on Coqui's XTTS-v2 multilingual model, this project provides a Streamlit web UI and a terminal interface for voice cloning across 16 languages, with integrated recording and file upload. It supports GPU acceleration via PyTorch CUDA and automatically downloads the pretrained models on first run. Cloning requires only a 10-second, 24 kHz WAV reference sample to generate speech in the target voice and language.
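The cloning flow described above can be sketched with Coqui's public Python API. This is a minimal sketch, not the project's own code: the helper names check_reference_wav and clone_voice, the default file paths, and the strict 24 kHz / 10-second check are assumptions made for illustration. Only the TTS.api.TTS class, its tts_to_file method, and the model name tts_models/multilingual/multi-dataset/xtts_v2 come from the Coqui TTS library itself.

```python
import wave


def check_reference_wav(path, expected_rate=24000, min_seconds=10.0):
    """Report the sample rate and duration of a WAV reference sample.

    Hypothetical helper: the thresholds mirror the "10-second 24 kHz WAV"
    requirement in the description, not a check the project is known to run.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate
    return {
        "sample_rate": rate,
        "seconds": seconds,
        "ok": rate == expected_rate and seconds >= min_seconds,
    }


def clone_voice(text, reference_wav, language="en", out_path="output.wav"):
    """Synthesize `text` in the voice of `reference_wav` using XTTS-v2.

    Imports are deferred so the validation helper above works without
    PyTorch or Coqui TTS installed. The model is downloaded on first use.
    """
    import torch
    from TTS.api import TTS

    # Use the GPU when PyTorch can see a CUDA device, otherwise the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

A caller would typically run check_reference_wav("reference.wav") first and only call clone_voice("Hello there.", "reference.wav", language="en") when the "ok" field is true. Separating validation from synthesis keeps the cheap check usable on machines without a GPU or the model weights.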

391 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25


Stars

391

Forks

Language

Python

License

MIT


Higher-rated alternatives

herimor/voxtream

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control

EveryVoiceTTS/EveryVoice

The EveryVoice TTS Toolkit - Text To Speech for your language

kadirnar/VoiceHub

VoiceHub: A Unified Inference Interface for TTS Models

NeonGeckoCom/neon-tts-plugin-coqui

Coqui AI TTS plugin

Atm4x/tts-with-rvc

TTS with RVC-module to generate .wav audios
