Tacotron-2 and tacotron

These are TensorFlow implementations of two generations of the same text-to-speech architecture rather than direct competitors: Tacotron-2 implements the improved successor model, which pairs spectrogram prediction with a WaveNet vocoder, while tacotron implements the original Tacotron and serves as a baseline.

Tacotron-2

Status: Established
Maintenance: 0/25
Adoption: 10/25
Maturity: 16/25
Community: 25/25
Stars: 2,317
Forks: 904
Downloads: not available
Commits (30d): 0
Language: Python
License: MIT

tacotron

Status: Established
Maintenance: 0/25
Adoption: 10/25
Maturity: 16/25
Community: 25/25
Stars: 1,833
Forks: 431
Downloads: not available
Commits (30d): 0
Language: Python
License: Apache-2.0

Stale 6m, No Package, No Dependents

About Tacotron-2

Rayhane-mamah/Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

Combines a sequence-to-sequence spectrogram prediction network with WaveNet vocoding for end-to-end text-to-speech synthesis in TensorFlow. Supports modular training of Tacotron and WaveNet components separately or jointly, with configurable hyperparameters for reproduction of paper results or enhanced performance variants. Handles LJSpeech and M-AILABS datasets with preprocessing pipelines and includes Griffin-Lim inversion tools for mel-spectrogram validation.

About tacotron

Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Implements an encoder-decoder architecture with an attention mechanism and uses Griffin-Lim reconstruction for spectrogram-to-waveform conversion. Trained on multiple public datasets (LJ Speech, audiobooks, Bible recordings). Includes a heavily documented training pipeline with bucketed batches, Noam learning rate scheduling, and gradient clipping, plus pre-trained checkpoints and attention visualization tools for monitoring alignment quality during training.
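To make two of the training techniques named above concrete, here is a minimal sketch of a Noam-style learning rate schedule and gradient clipping by global norm. It is written in plain Python for illustration and is not taken from the repository: the function names, the default values (init_lr, warmup_steps, max_norm), and the list-of-lists gradient layout are all assumptions.

```python
import math


def noam_learning_rate(step, init_lr=0.001, warmup_steps=4000):
    """Noam-style schedule: linear warmup, then decay proportional to step ** -0.5.

    init_lr and warmup_steps are hypothetical defaults chosen for illustration.
    """
    step = max(step, 1)  # guard against step 0, where step ** -0.5 is undefined
    return init_lr * warmup_steps ** 0.5 * min(
        step * warmup_steps ** -1.5,  # warmup branch, grows linearly with step
        step ** -0.5,                 # decay branch, shrinks with the inverse square root
    )


def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale gradient vectors so that their joint L2 norm is at most max_norm.

    grads is a list of lists of floats (one inner list per parameter tensor);
    this layout and the max_norm default are assumptions for the sketch.
    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if global_norm <= max_norm:
        return [list(vec) for vec in grads], global_norm
    scale = max_norm / global_norm
    return [[g * scale for g in vec] for vec in grads], global_norm
```

With this form the learning rate peaks at init_lr when step equals warmup_steps and falls off afterwards, while clipping leaves small gradients untouched and shrinks large ones uniformly, so their direction is preserved.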

Related comparisons

Tacotron-2 and Tacotron-pytorch

Tacotron-2 and Tacotron

Tacotron-2 and Tacotron-pytorch

Tacotron-2 and tacotron_asr

Scores updated daily from GitHub, PyPI, and npm data.