ekwek1/soprano-factory

Soprano-Factory: Train your own 2000x realtime text-to-speech model

Established

Built on the ultra-lightweight Soprano TTS architecture, this training framework fine-tunes custom models on local hardware from LJSpeech-formatted datasets, automatically resampling audio to 32 kHz. The 600-line implementation supports adding new voices and languages while preserving the base model's efficiency (80M parameters, <1GB memory) and its inference speed (advertised as 2000x realtime). Trained models integrate directly with the Soprano inference pipeline on CUDA, CPU, and MPS devices across Windows, Linux, and macOS.
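To make the dataset requirement concrete, here is a minimal sketch of reading LJSpeech-style metadata and resampling audio to 32 kHz. It is not taken from the Soprano-Factory code. The function names `parse_ljspeech_metadata` and `resample_linear` are hypothetical, and plain linear interpolation stands in for whatever resampler the framework actually uses. The only assumption about the data is the standard LJSpeech layout: a header-less, pipe-delimited `metadata.csv` with rows of the form `id|transcript|normalized transcript`.

```python
import csv

TARGET_SR = 32_000  # sample rate named in the description above


def parse_ljspeech_metadata(lines):
    """Yield (clip_id, text) pairs from LJSpeech-style 'id|text|normalized' rows.

    Hypothetical helper for illustration; not part of Soprano-Factory.
    """
    # LJSpeech metadata has no header and no quoting, so quotes are kept literal.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        clip_id = row[0]
        # Prefer the normalized transcript (third column) when it is present.
        text = row[2] if len(row) > 2 and row[2] else row[1]
        yield clip_id, text


def resample_linear(samples, src_sr, dst_sr=TARGET_SR):
    """Resample a mono signal by linear interpolation.

    Illustrative stand-in only: a real pipeline would normally use a
    band-limited resampler to avoid aliasing.
    """
    if src_sr == dst_sr:
        return list(samples)
    n_out = int(len(samples) * dst_sr / src_sr)
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_sr / dst_sr      # fractional position in the source signal
        lo = int(pos)
        hi = min(lo + 1, last)         # clamp at the final sample
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

Under these assumptions, `list(parse_ljspeech_metadata(open("metadata.csv", encoding="utf-8")))` would give the clip IDs and transcripts, and each clip's samples (LJSpeech itself ships at 22,050 Hz) could be passed through `resample_linear(samples, 22050)` before training.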

212 stars.

No Package, No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 11 / 25

Community 19 / 25

Stars: 212

Forks

Language: Python

License: Apache-2.0

Related tools

TuananhCR/Dia-Finetuning-Vietnamese

TTS Dia finetuning for Vietnamese

thinhlpg/vixtts-demo

A Vietnamese Voice Cloning Text-to-Speech Model ✨

dangvansam/viet-tts

VietTTS: An Open-Source Vietnamese Text to Speech

NTT123/vietTTS

Vietnamese Text to Speech library

modelscope/KAN-TTS

KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at ...
