Tacotron-2 and tacotron

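These repositories implement two generations of the same text-to-speech architecture family rather than competing versions of one model. Rayhane-mamah/Tacotron-2 implements Tacotron 2, the successor that pairs a spectrogram prediction network with a WaveNet vocoder, while Kyubyong/tacotron implements the original Tacotron, which relies on Griffin-Lim for waveform synthesis.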

Metric          Tacotron-2     tacotron
Score           51             51
Tier            Established    Established
Maintenance     0/25           0/25
Adoption        10/25          10/25
Maturity        16/25          16/25
Community       25/25          25/25
Stars           2,317          1,833
Forks           904            431
Downloads       not listed     not listed
Commits (30d)   0              0
Language        Python         Python
License         MIT            Apache-2.0

Flags (both repositories): Stale 6m, No Package, No Dependents
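
For both repositories the four category scores (Maintenance, Adoption, Maturity, Community) add up to the overall score of 51. That suggests, as an assumption this page does not confirm, that the overall score is a plain sum of four categories each capped at 25. A minimal sketch of that arithmetic, with the hypothetical helper name total_score:

```python
# Hypothetical helper: assumes the overall score is the sum of the
# four category scores, each on a 0-25 scale (not confirmed by the page).
def total_score(categories):
    for name, value in categories.items():
        if not 0 <= value <= 25:
            raise ValueError(f"{name} must be between 0 and 25")
    return sum(categories.values())


# Category values as listed for both repositories.
listed = {"Maintenance": 0, "Adoption": 10, "Maturity": 16, "Community": 25}
overall = total_score(listed)
```

Under this assumption, the zero Maintenance score caps either repository at 75 no matter how strong the other categories are.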

About Tacotron-2

Rayhane-mamah/Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

Combines a sequence-to-sequence spectrogram prediction network with WaveNet vocoding for end-to-end text-to-speech synthesis in TensorFlow. Supports modular training of Tacotron and WaveNet components separately or jointly, with configurable hyperparameters for reproduction of paper results or enhanced performance variants. Handles LJSpeech and M-AILABS datasets with preprocessing pipelines and includes Griffin-Lim inversion tools for mel-spectrogram validation.

About tacotron

Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Implements an encoder-decoder architecture with an attention mechanism and a Griffin-Lim vocoder for spectrogram-to-waveform conversion, trained on multiple public datasets (LJ Speech, audiobooks, Bible recordings). Includes a heavily documented training pipeline with bucketed batches, Noam learning rate scheduling, and gradient clipping, plus pre-trained checkpoints and attention visualization tools for monitoring alignment quality during training.
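
The Noam learning rate schedule mentioned above can be sketched in a few lines. This is an illustration of the commonly used formulation (linear warmup followed by inverse-square-root decay), not code taken from the repository; the function name noam_lr and the default values for init_lr and warmup_steps are assumptions chosen for the example.

```python
# Sketch of Noam-style learning rate scheduling.
# Assumed formulation: the rate rises linearly for `warmup_steps` steps,
# peaks at `init_lr`, then decays proportionally to 1 / sqrt(step).
def noam_lr(step, init_lr=0.001, warmup_steps=4000):
    if step < 1:
        raise ValueError("step must be >= 1")
    # The first term dominates during warmup, the second after it.
    scale = min(step * warmup_steps ** -1.5, step ** -0.5)
    return init_lr * warmup_steps ** 0.5 * scale


# Sample the schedule at a few training steps.
schedule = {step: noam_lr(step) for step in (1, 2000, 4000, 16000)}
```

With these assumed defaults the rate peaks at init_lr exactly when step equals warmup_steps, and it halves each time the step count quadruples after that point.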

Scores are updated daily from GitHub, PyPI, and npm data.