TensorFlowTTS and tacotron2-mandarin
These projects are ecosystem siblings: TensorFlowTTS is a comprehensive framework that includes Tacotron-2 as one of its supported model architectures, while tacotron2-mandarin is a specialized adaptation of that same architecture for Mandarin Chinese.
About TensorFlowTTS
TensorSpeech/TensorFlowTTS
TensorFlowTTS: Real-time state-of-the-art speech synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
Implements modular encoder-decoder architectures (Tacotron-2, FastSpeech/FastSpeech2) paired with neural vocoders (MelGAN, HiFi-GAN, Parallel WaveGAN) for end-to-end text-to-speech, enabling faster-than-real-time inference through TensorFlow 2 optimization techniques such as quantization-aware training and pruning. Supports deployment across diverse platforms, including TFLite for mobile/embedded systems, C++ inference, and iOS, with pretrained models published on the Hugging Face Hub. Provides a language-agnostic processor pipeline for fine-tuning on new languages, demonstrated with Chinese, Korean, French, and German examples.
About tacotron2-mandarin
atomicoo/tacotron2-mandarin
TensorFlow implementation of a Chinese/Mandarin TTS (text-to-speech) system based on the Tacotron-2 model.
Implements the seq2seq encoder-decoder architecture from Google's Tacotron-2 paper, predicting mel-spectrograms from Chinese text input and using Griffin-Lim vocoding for waveform synthesis. Supports multiple open datasets (BIAOBEI, THCHS-30) with preprocessing pipelines for audio normalization and mel-spectrogram extraction. Includes pretrained model checkpoints and evaluation utilities for monitoring synthesis quality during training.