abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

/ 100

Established

Provides an integrated speech-to-text pipeline that combines multiple Whisper variants (Faster-Whisper, WhisperX) for accurate transcription and precise timing, then routes the output through Deep-Translator for translation into 100+ languages. Built on Gradio with CUDA acceleration for Windows/NVIDIA GPU environments, it orchestrates Demucs/MDX-Net vocal isolation, RVC voice conversion, and multiple TTS engines (kokoro, Edge-TTS, CosyVoice) into a unified multimedia dubbing workflow without external API dependencies.

6,366 stars.

No Package, No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25

Forks: 687

Language: Python

License: GPL-3.0

Related tools

snakers4/silero-models

Silero Models: pre-trained text-to-speech models made embarrassingly simple

snakers4/silero-stress

Silero Stress — pre-trained enterprise-grade automated stress and homograph disambiguation for...

JSchmie/ScrAIbe-WebUI

WebUI for ScrAIbe

isaiahbjork/orpheus-tts-local

Run Orpheus 3B Locally With LM Studio

aman179102/podvoice

Local-first CLI that turns Markdown scripts into multi-speaker podcast-style audio using Coqui XTTS v2.
