xtts-webui and xtts2-ui
These are competing tools: both provide web UIs for the same XTTS voice cloning model. The main difference is that xtts2-ui targets XTTS-v2 specifically and advertises cloning from as little as 10 seconds of reference audio, while xtts-webui takes a broader approach that also covers fine-tuning and post-processing.
About xtts-webui
daswer123/xtts-webui
Webui for using XTTS and for finetuning it
Integrates XTTSv2 with modular voice processing pipelines supporting RVC, OpenVoice, and Resemble Enhance for post-processing synthesis results. Provides batch audio dubbing with automatic translation while preserving speaker identity, plus fine-tuning capabilities with custom model selection and optimized export. Runs locally on NVIDIA GPUs (6GB+ VRAM) via PyTorch/CUDA, with optional deepspeed acceleration and low-VRAM mode for resource-constrained setups.
About xtts2-ui
BoltzmannEntropy/xtts2-ui
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
Built on Coqui's XTTS-v2 multilingual model, this project provides both a web UI (Streamlit) and terminal interface for voice cloning across 16 languages with integrated recording and file upload capabilities. The architecture supports GPU acceleration via PyTorch CUDA and automatically downloads pretrained models on first run, with the cloning process requiring only a 10-second 24kHz WAV reference sample to generate speech in the target voice and language.
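Since the cloning workflow hinges on the reference sample meeting the stated format (a roughly 10-second WAV at 24 kHz), a minimal pre-flight check with Python's standard-library `wave` module can catch bad inputs before synthesis. This is an illustrative sketch, not code from either repository; the `check_reference` helper and its thresholds are assumptions based on the description above.

```python
import math
import struct
import wave

def check_reference(path, min_seconds=10.0, expected_rate=24000):
    """Hypothetical pre-flight check: verify a reference WAV matches the
    requirements described for XTTS-2 (24 kHz, at least ~10 s of audio)."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        seconds = wf.getnframes() / rate
    return rate == expected_rate and seconds >= min_seconds

# Build a synthetic 12-second, 24 kHz mono 16-bit WAV to exercise the check.
with wave.open("ref.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)        # 16-bit PCM
    wf.setframerate(24000)
    tone = b"".join(
        struct.pack("<h", int(3000 * math.sin(2 * math.pi * 440 * t / 24000)))
        for t in range(24000 * 12)
    )
    wf.writeframes(tone)

print(check_reference("ref.wav"))  # True: 24 kHz and >= 10 seconds
```

A clip that is too short or resampled at 44.1 kHz would fail this check, which is usually cheaper to discover before the model download and GPU inference step.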