Tacotron and tacotron

These are **competitors**: independent implementations of the same Tacotron text-to-speech architecture in different deep learning frameworks (PyTorch vs. TensorFlow). They serve the same use case and have no dependency on each other.

| | Tacotron (bshall) | tacotron (Kyubyong) |
|---|---|---|
| Overall score | 58 (Established) | 51 (Established) |
| Maintenance | 0/25 | 0/25 |
| Adoption | 13/25 | 10/25 |
| Maturity | 25/25 | 16/25 |
| Community | 20/25 | 25/25 |
| Stars | 115 | 1,833 |
| Forks | 26 | 431 |
| Downloads | 32 | — |
| Commits (30d) | 0 | 0 |
| Language | Python | Python |
| License | MIT | Apache-2.0 |
| Flags | Stale 6m | Stale 6m · No Package · No Dependents |

About Tacotron

bshall/Tacotron

A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

Implements location-relative attention with dynamic convolution to improve alignment robustness in text-to-mel-spectrogram synthesis, enabling stable training on single GPUs with mixed precision. Integrates with the UniversalVocoder for end-to-end audio generation from text via CMUDict phoneme conversion. Provides pretrained LJSpeech weights and preprocessing utilities for dataset training, with architectural optimizations including gradient clipping and modified learning schedules for efficient single-GPU convergence.
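The location-relative mechanism described above (dynamic convolution attention) computes alignment energies from the *previous* attention weights rather than from encoder content, using one fixed convolution filter plus filters predicted from the decoder state. A heavily simplified single-filter NumPy sketch is below; the function names, shapes, and scalar mixing weights are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def dca_step(prev_attn, query, static_filt, dyn_proj, w_static, w_dyn):
    """One simplified dynamic-convolution-attention step (illustrative).

    prev_attn   : (T,)   previous attention weights over encoder steps
    query       : (D,)   decoder state used to predict the dynamic filter
    static_filt : (K,)   fixed location filter
    dyn_proj    : (K, D) projection turning the query into a filter
    w_static, w_dyn : scalar weights mixing the two energy terms
    """
    # Static path: location features from a fixed convolution.
    static_feat = np.convolve(prev_attn, static_filt, mode="same")
    # Dynamic path: the filter is a function of the current decoder
    # state, so the attention can adapt its step size per frame.
    dyn_filt = np.tanh(dyn_proj @ query)
    dyn_feat = np.convolve(prev_attn, dyn_filt, mode="same")
    # Energies depend only on the previous alignment, not on content,
    # which is what makes the mechanism location-relative.
    return softmax(w_static * static_feat + w_dyn * dyn_feat)

# Usage: attention mass stays normalized after a step.
rng = np.random.default_rng(0)
T, D, K = 20, 8, 5
attn = np.zeros(T)
attn[3] = 1.0  # alignment currently concentrated at encoder step 3
out = dca_step(attn, rng.standard_normal(D),
               np.ones(K) / K, rng.standard_normal((K, D)) * 0.1,
               w_static=1.0, w_dyn=1.0)
```

Because the energies never look at encoder content, the alignment can only drift smoothly from its previous position, which is the property that makes long-form synthesis robust.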

About tacotron

Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Implements an encoder-decoder architecture with an attention mechanism and a Griffin-Lim vocoder for mel-spectrogram-to-waveform conversion, trained on multiple public datasets (LJ Speech, audiobooks, Bible recordings). Includes a heavily documented training pipeline with bucketed batches, Noam learning rate scheduling, and gradient clipping, plus pre-trained checkpoints and attention visualization tools for monitoring alignment quality during training.
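The Griffin-Lim step mentioned above recovers a waveform from a magnitude-only spectrogram by starting from random phase and alternating inverse and forward STFTs, keeping the target magnitude each round. A minimal SciPy sketch follows; the window sizes, iteration count, and helper name are illustrative assumptions, not the repository's hyperparameters.

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(mag, n_iter=32, nperseg=256, noverlap=192):
    """Estimate a waveform whose STFT magnitude approximates `mag`.

    Starts from random phase, then alternates inverse STFT and
    forward STFT, keeping the target magnitude and the latest
    phase estimate at each iteration.
    """
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        _, x = istft(mag * phase, nperseg=nperseg, noverlap=noverlap)
        _, _, spec = stft(x, nperseg=nperseg, noverlap=noverlap)
        phase = np.exp(1j * np.angle(spec))  # keep phase, drop magnitude
    return x

# Usage: round-trip a pure tone through its magnitude spectrogram.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
_, _, spec = stft(tone, nperseg=256, noverlap=192)
rec = griffin_lim(np.abs(spec))
```

Each iteration projects the estimate back onto the set of signals with the target magnitude, so the STFT magnitude of `rec` converges toward `np.abs(spec)` even though the true phase was discarded.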

Scores updated daily from GitHub, PyPI, and npm data.