dunky11/voicesmith
[WIP] VoiceSmith makes training text-to-speech models easy.
VoiceSmith is built on a two-stage pipeline that combines modified DelightfulTTS and UnivNet architectures, pretrained on 5,000 speakers, and it supports fine-tuning single-speaker and multi-speaker TTS models without writing code. It includes automatic text normalization and dataset preprocessing tools, GPU acceleration via CUDA, and containerized training through Docker. A desktop installer targets Windows and Linux, and the tool supports inference on both custom and emotional speech datasets.
229 stars. No commits in the last 6 months.
Stars: 229
Forks: 33
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 10, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/dunky11/voicesmith"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
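The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It makes two assumptions that this page does not confirm: that the path follows the pattern category/owner/repo, inferred from the single curl example above, and that the response body is JSON. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL.

    Assumption: the path layout is <category>/<owner>/<repo>, as inferred
    from the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON; the schema is not documented
    here, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs network access and counts against the daily limit):
# data = fetch_quality("voice-ai", "dunky11", "voicesmith")
```

No API key is sent in this sketch, since the way a key is passed is not described here; unauthenticated calls fall under the 100 requests/day limit.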
Higher-rated alternatives
herimor/voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control
EveryVoiceTTS/EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
kadirnar/VoiceHub
VoiceHub: A Unified Inference Interface for TTS Models
NeonGeckoCom/neon-tts-plugin-coqui
Coqui AI TTS plugin
Atm4x/tts-with-rvc
TTS with RVC-module to generate .wav audios