tacotron and Tacotron-pytorch
These are competing implementations of the same Tacotron architecture in different deep learning frameworks (TensorFlow vs. PyTorch). They are alternatives rather than complementary tools, so the choice comes down to which framework you prefer.
About tacotron
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Implements an encoder-decoder architecture with an attention mechanism and uses the Griffin-Lim algorithm to convert predicted spectrograms to waveforms. Trained on multiple public datasets (LJ Speech, audiobooks, Bible recordings). Includes a heavily documented training pipeline with bucketed batches, Noam learning rate scheduling, and gradient clipping, plus pre-trained checkpoints and attention visualization tools for monitoring alignment quality during training.
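As an illustration of the Noam learning rate scheduling mentioned above, here is a minimal sketch in plain Python. It shows one common form of the schedule: a linear warmup followed by inverse-square-root decay, scaled so the rate peaks at `init_lr` when the step count reaches `warmup_steps`. The function name and the default values (`init_lr=0.001`, `warmup_steps=4000`) are assumptions chosen for illustration, not values confirmed from the repository.

```python
def noam_learning_rate(step, init_lr=0.001, warmup_steps=4000):
    """Return the learning rate for a 1-based training step.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    The rate grows linearly for `warmup_steps` steps, reaches `init_lr`,
    then decays proportionally to 1 / sqrt(step).
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    warmup_term = step * warmup_steps ** -1.5  # linear growth during warmup
    decay_term = step ** -0.5                  # inverse-sqrt decay afterwards
    return init_lr * warmup_steps ** 0.5 * min(warmup_term, decay_term)


if __name__ == "__main__":
    for s in (1, 1000, 4000, 16000):
        print(s, noam_learning_rate(s))
```

With these assumed defaults the rate rises until step 4000 and has halved by step 16000, since quadrupling the step count halves the inverse-square-root term.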
About Tacotron-pytorch
soobinseo/Tacotron-pytorch
Pytorch implementation of Tacotron
Implements the full Tacotron architecture with encoder-decoder attention, CBHG modules, and mel-spectrogram generation for end-to-end text-to-speech synthesis. Preprocesses text into phoneme indices and audio into spectrograms, supporting the LJSpeech dataset pipeline. Includes separate training and inference scripts for model optimization and TTS sample generation.
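The text preprocessing step described above can be sketched roughly as follows. This is a simplified illustration and not the repository's code: the symbol table, the `text_to_indices` and `pad_batch` names, and the padding and end-of-sequence markers are all hypothetical, and lowercase characters stand in for whatever symbol inventory the real pipeline uses.

```python
# Hypothetical symbol inventory; the real project's table may differ.
PAD = "_"  # padding symbol, index 0
EOS = "~"  # end-of-sequence symbol, index 1
SYMBOLS = [PAD, EOS] + list("abcdefghijklmnopqrstuvwxyz '.,?!")
SYMBOL_TO_ID = {symbol: index for index, symbol in enumerate(SYMBOLS)}


def text_to_indices(text):
    """Map text to a list of integer symbol ids, ending with EOS.

    Characters outside the inventory (and the reserved PAD/EOS
    symbols themselves) are silently dropped.
    """
    ids = [
        SYMBOL_TO_ID[ch]
        for ch in text.lower()
        if ch in SYMBOL_TO_ID and ch not in (PAD, EOS)
    ]
    ids.append(SYMBOL_TO_ID[EOS])
    return ids


def pad_batch(sequences):
    """Right-pad every id sequence with PAD to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    pad_id = SYMBOL_TO_ID[PAD]
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]


if __name__ == "__main__":
    batch = [text_to_indices("Hello, world!"), text_to_indices("Hi")]
    print(pad_batch(batch))
```

Padding to a common length is what lets variable-length sentences be stacked into a single tensor for the encoder.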