NTT123/vietTTS
Vietnamese Text to Speech library
Synthesizes Vietnamese speech from text with a three-stage neural pipeline: duration prediction, acoustic feature generation, and HiFiGAN vocoding. The models are trained on the denoised InfoRe dataset, with forced alignment from Montreal Forced Aligner. The library supports model finetuning on ground-truth mel-spectrograms, and experimental multi-speaker synthesis is available on a separate branch. It is implemented in JAX/Haiku with a conversion step for the PyTorch vocoder, and pretrained models can be used for inference either offline or inside a larger pipeline.
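The three stages can be pictured with a minimal sketch. This is not the vietTTS code: the function names, the sample rate, and the hop length below are hypothetical placeholders chosen for illustration, and the three models are passed in as plain callables. The sketch only shows how predicted per-phoneme durations are turned into frame counts and used to stretch the phoneme sequence before the acoustic model and vocoder run.

```python
from typing import Callable, List, Sequence

# Hypothetical constants for illustration only; not taken from the vietTTS repository.
SAMPLE_RATE = 16_000  # audio samples per second (assumed)
HOP_LENGTH = 256      # audio samples per mel frame (assumed)


def durations_to_frames(durations_sec: Sequence[float]) -> List[int]:
    """Convert per-phoneme durations in seconds to whole mel-frame counts."""
    frames_per_sec = SAMPLE_RATE / HOP_LENGTH
    # Every phoneme gets at least one frame so it is never dropped.
    return [max(1, round(d * frames_per_sec)) for d in durations_sec]


def length_regulate(phonemes: Sequence[str], frame_counts: Sequence[int]) -> List[str]:
    """Repeat each phoneme once per frame so the sequence matches the mel length."""
    expanded: List[str] = []
    for phoneme, count in zip(phonemes, frame_counts):
        expanded.extend([phoneme] * count)
    return expanded


def synthesize(
    phonemes: Sequence[str],
    duration_model: Callable[[Sequence[str]], List[float]],
    acoustic_model: Callable[[List[str]], List[List[float]]],
    vocoder: Callable[[List[List[float]]], List[float]],
) -> List[float]:
    """Chain the stages: durations -> frame alignment -> mel frames -> waveform."""
    durations = duration_model(phonemes)          # stage 1: seconds per phoneme
    frame_counts = durations_to_frames(durations)
    aligned = length_regulate(phonemes, frame_counts)
    mel = acoustic_model(aligned)                 # stage 2: one mel frame per aligned step
    return vocoder(mel)                           # stage 3: mel frames -> audio samples


# Stand-in models so the sketch runs end to end; real models would be neural networks.
def fake_duration_model(phonemes: Sequence[str]) -> List[float]:
    return [0.08 for _ in phonemes]


def fake_acoustic_model(aligned: List[str]) -> List[List[float]]:
    return [[0.0] * 80 for _ in aligned]


def fake_vocoder(mel: List[List[float]]) -> List[float]:
    return [0.0] * (len(mel) * HOP_LENGTH)


waveform = synthesize(["x", "i", "n"], fake_duration_model, fake_acoustic_model, fake_vocoder)
```

Keeping the stages as separate callables mirrors the description above: each stage can be trained or swapped independently, as long as the frame count agreed on by the duration step is respected downstream.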
255 stars. No commits in the last 6 months.
Stars: 255
Forks: 104
Language: Python
License: MIT
Category:
Last pushed: Aug 20, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/NTT123/vietTTS"
The endpoint is open to everyone: 100 requests/day without a key, or 1,000 requests/day with a free key.
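The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the idea that other repositories follow the same URL pattern is inferred from the single example above, and the response is assumed (not confirmed) to be JSON. No network call happens unless the script is run directly.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo pattern is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body, assuming it is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real HTTP request and counts against the daily limit.
    print(fetch_quality("voice-ai", "NTT123", "vietTTS"))
```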
Related tools
TuananhCR/Dia-Finetuning-Vietnamese
TTS Dia finetuning for Vietnamese
thinhlpg/vixtts-demo
A Vietnamese Voice Cloning Text-to-Speech Model ✨
dangvansam/viet-tts
VietTTS: An Open-Source Vietnamese Text to Speech
ekwek1/soprano-factory
Soprano-Factory: Train your own 2000x realtime text-to-speech model
modelscope/KAN-TTS
KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at ...