abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Provides an integrated speech-to-text pipeline that combines Whisper variants (Faster-Whisper, WhisperX) for transcription accuracy and precise timing, then routes the transcript through Deep-Translator to support 100+ languages. Built on Gradio with CUDA acceleration for Windows/NVIDIA GPU environments, it orchestrates Demucs/MDX-Net vocal isolation, RVC voice conversion, and multiple TTS engines (kokoro, Edge-TTS, CosyVoice) into a unified multimedia dubbing workflow without external API dependencies.
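The transcribe-then-translate flow described above can be sketched roughly as follows. This is a minimal illustration and not voice-pro's own code: the `Segment` type and the `transcribe`, `translate_segments`, and `make_translator` helpers are hypothetical names, and the sketch assumes the `faster-whisper` and `deep-translator` packages are installed and that Deep-Translator's `GoogleTranslator` backend is the one in use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Segment:
    """Hypothetical container for one timed piece of transcript."""
    start: float  # seconds
    end: float    # seconds
    text: str


def transcribe(audio_path: str, model_size: str = "small") -> List[Segment]:
    """Speech-to-text with faster-whisper (imported lazily; assumes a CUDA GPU)."""
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    segments, _info = model.transcribe(audio_path)
    return [Segment(s.start, s.end, s.text.strip()) for s in segments]


def translate_segments(
    segments: Iterable[Segment], translate: Callable[[str], str]
) -> List[Segment]:
    """Translate each segment's text while keeping its original timing."""
    return [Segment(s.start, s.end, translate(s.text)) for s in segments]


def make_translator(target: str) -> Callable[[str], str]:
    """Build a text translator from deep-translator (assumed backend: Google)."""
    from deep_translator import GoogleTranslator

    return GoogleTranslator(source="auto", target=target).translate
```

A call such as `translate_segments(transcribe("input.wav"), make_translator("ko"))` would then yield translated segments whose timestamps can drive subtitle or dubbing output; the file name and target language here are placeholders.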
Stars: 6,366
Forks: 687
Language: Python
License: GPL-3.0
Category:
Last pushed: Dec 05, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/abus-aikorea/voice-pro"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
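The same request can be made from Python using only the standard library. This is a sketch under assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON (the page does not document its format), and `build_quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL; the path layout is inferred from the curl example."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data, assuming the endpoint returns a JSON body."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `fetch_quality("voice-ai", "abus-aikorea", "voice-pro")` requests the same URL as the curl command. Unauthenticated use is limited to 100 requests/day, so callers that poll should cache the result.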
Related tools
snakers4/silero-models
Silero Models: pre-trained text-to-speech models made embarrassingly simple
snakers4/silero-stress
Silero Stress — pre-trained enterprise-grade automated stress and homograph disambiguation for...
JSchmie/ScrAIbe-WebUI
WebUI for ScrAIbe
isaiahbjork/orpheus-tts-local
Run Orpheus 3B Locally With LM Studio
aman179102/podvoice
Local-first CLI that turns Markdown scripts into multi-speaker podcast-style audio using Coqui XTTS v2.