viet-tts and vietTTS
These are independent Vietnamese TTS projects that serve the same purpose, making them competitors: users would choose one based on code quality, model performance, and maintenance activity rather than using both together.
About viet-tts
dangvansam/viet-tts
VietTTS: An Open-Source Vietnamese Text to Speech
Implements a prompt-based voice cloning architecture that generates speech by conditioning on reference audio, enabling zero-shot synthesis of new voices without retraining. Provides OpenAI API-compatible endpoints and supports both pre-built voices and custom voice cloning from local audio files. Available via Python package, Docker container, or command-line interface with streaming inference capabilities.
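Since the server advertises OpenAI API compatibility, a client can target it the same way it would target OpenAI's /v1/audio/speech route. The sketch below only assembles such a request; the base URL, port, model name, and voice name are illustrative assumptions, not values confirmed by the project's documentation.

```python
# Sketch of a request to an OpenAI-compatible TTS endpoint like the one
# viet-tts exposes. Base URL, model, and voice below are assumed values.
import json

def build_speech_request(base_url, text, voice, model="tts-1"):
    """Assemble URL, headers, and JSON body following the shape of
    OpenAI's /v1/audio/speech endpoint."""
    url = f"{base_url.rstrip('/')}/v1/audio/speech"
    headers = {
        "Authorization": "Bearer local-placeholder",  # local servers often ignore this
        "Content-Type": "application/json",
    }
    body = {"model": model, "input": text, "voice": voice}
    return url, headers, body

# Assumed local server address and voice name for illustration only.
url, headers, body = build_speech_request(
    "http://localhost:8000", "Xin chào!", voice="female-1"
)
print(url)
print(json.dumps(body, ensure_ascii=False))

# To actually synthesize, POST the body and save the returned audio bytes:
#   import requests
#   audio = requests.post(url, headers=headers, json=body).content
#   open("out.wav", "wb").write(audio)
```

Because the request shape matches OpenAI's, existing OpenAI client libraries can usually be pointed at the server simply by overriding their base URL.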
About vietTTS
NTT123/vietTTS
Vietnamese Text to Speech library
Combines a three-stage neural architecture—duration prediction, acoustic feature generation, and HiFiGAN vocoding—to synthesize Vietnamese speech from text. Trained on the denoised InfoRe dataset with forced alignment via Montreal Forced Aligner, it supports model finetuning on ground-truth mel-spectrograms and experimental multi-speaker synthesis on a separate branch. Implemented in JAX/Haiku with PyTorch vocoder conversion, enabling both offline synthesis and integration into production pipelines via pretrained model inference.
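The three-stage data flow described above can be sketched end to end with stub models. Everything here is a toy stand-in (the real project implements these stages in JAX/Haiku); the mel channel count and hop length are common settings, not values taken from the repository.

```python
# Toy sketch of a duration -> acoustic -> vocoder pipeline.
# Each stage is a stub; real models (duration predictor, acoustic
# model, HiFiGAN) would replace the stub bodies.
import numpy as np

N_MELS = 80        # mel channels, a typical acoustic-model setting (assumed)
HOP_LENGTH = 256   # audio samples generated per mel frame (assumed)

def predict_durations(phonemes):
    # Stage 1 stub: predict how many mel frames each phoneme spans.
    rng = np.random.default_rng(0)
    return rng.integers(3, 10, size=len(phonemes))

def generate_mels(phonemes, durations):
    # Stage 2 stub: expand phonemes by duration into mel-spectrogram frames.
    total_frames = int(durations.sum())
    return np.zeros((total_frames, N_MELS), dtype=np.float32)

def vocode(mels):
    # Stage 3 stub: HiFiGAN-style upsampling from mel frames to waveform.
    return np.zeros(mels.shape[0] * HOP_LENGTH, dtype=np.float32)

phonemes = ["x", "i", "n", "ch", "a", "o"]  # toy phoneme sequence
durations = predict_durations(phonemes)
mels = generate_mels(phonemes, durations)
audio = vocode(mels)
print(mels.shape, audio.shape)
```

The key property the sketch illustrates is that the duration predictor decouples text length from audio length: the vocoder's output is always exactly `HOP_LENGTH` samples per mel frame, so timing is controlled entirely by stage one.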