ekwek1/soprano-factory
Soprano-Factory: Train your own 2000x realtime text-to-speech model
Soprano-Factory is a training framework built on the ultra-lightweight Soprano TTS architecture. It fine-tunes custom models on local hardware from LJSpeech-formatted datasets and resamples audio to 32 kHz automatically. The 600-line implementation supports adding new voices and languages while keeping the base model's efficiency (80M parameters, under 1 GB of memory) and its high inference speed (advertised as 2000x realtime). Trained models plug directly into the Soprano inference pipeline on CUDA, CPU, and MPS devices across Windows, Linux, and macOS.
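For readers unfamiliar with the LJSpeech layout mentioned above, here is a minimal sketch of reading such a dataset. It assumes the conventional LJSpeech structure (a pipe-delimited `metadata.csv` next to a `wavs/` directory); the function name `read_ljspeech_metadata` is hypothetical and is not part of Soprano-Factory, whose own loader and resampling step may differ.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def read_ljspeech_metadata(dataset_dir: str) -> List[Tuple[Path, str]]:
    """Return (wav_path, transcript) pairs from an LJSpeech-style dataset.

    Hypothetical helper for illustration; assumes the conventional layout:
        dataset_dir/metadata.csv   lines of  id|text  or  id|text|normalized text
        dataset_dir/wavs/<id>.wav
    """
    root = Path(dataset_dir)
    pairs: List[Tuple[Path, str]] = []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        # LJSpeech uses '|' as the delimiter and does not quote fields.
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank or malformed lines that lack an id or a transcript.
            if len(row) < 2 or not row[0].strip():
                continue
            clip_id = row[0].strip()
            # Prefer the normalized transcript (third column) when present.
            text = row[2] if len(row) > 2 and row[2].strip() else row[1]
            pairs.append((root / "wavs" / f"{clip_id}.wav", text.strip()))
    return pairs
```

Each returned pair points at the audio file and its transcript. Audio loading and the 32 kHz resampling are left to the framework and are not shown here.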
Stars: 212
Forks: 33
Language: Python
License: Apache-2.0
Category: (not listed)
Last pushed: Jan 13, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ekwek1/soprano-factory"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
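The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions not confirmed by the listing: that the endpoint returns JSON, and that the path follows a `<category>/<owner>/<repo>` pattern (inferred from the single URL shown above). The names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    req = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; counts against the daily limit.
    print(fetch_quality("voice-ai", "ekwek1", "soprano-factory"))
```

Since the response schema is not documented in this listing, the sketch returns the decoded payload as-is rather than picking out specific fields.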
Related tools
TuananhCR/Dia-Finetuning-Vietnamese
TTS Dia finetuning for Vietnamese
thinhlpg/vixtts-demo
A Vietnamese Voice Cloning Text-to-Speech Model ✨
dangvansam/viet-tts
VietTTS: An Open-Source Vietnamese Text to Speech
NTT123/vietTTS
Vietnamese Text to Speech library
modelscope/KAN-TTS
KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at ...