tacotron and tacotron_asr

These are ecosystem siblings: tacotron implements Tacotron for text-to-speech (TTS), while tacotron_asr adapts the same architecture for the reverse direction, automatic speech recognition (ASR). Both share the same foundational model design.

tacotron: 51 (Established)
Maintenance 0/25, Adoption 10/25, Maturity 16/25, Community 25/25
Stars: 1,833
Forks: 431
Downloads:
Commits (30d): 0
Language: Python
License: Apache-2.0
Stale 6m, No Package, No Dependents

tacotron_asr: 47 (Emerging)
Maintenance 0/25, Adoption 10/25, Maturity 16/25, Community 21/25
Stars: 164
Forks: 39
Downloads:
Commits (30d): 0
Language: Python
License: Apache-2.0
Stale 6m, No Package, No Dependents

About tacotron

Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Implements an encoder-decoder architecture with an attention mechanism and uses the Griffin-Lim algorithm to convert predicted spectrograms to waveforms. Trained on multiple public datasets (LJ Speech, audiobooks, Bible recordings). Includes a heavily documented training pipeline with bucketed batches, Noam learning rate scheduling, and gradient clipping, plus pre-trained checkpoints and attention visualization tools for monitoring alignment quality during training.
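As a rough illustration of two of the training techniques named above, the following minimal sketch shows a Noam-style learning rate schedule (linear warmup followed by inverse-square-root decay) and gradient clipping. It is not taken from the repository: the function names, the default values (`init_lr=0.001`, `warmup_steps=4000`, `max_norm=5.0`), and the choice of clipping by global norm are all assumptions made for the example.

```python
import math


def noam_learning_rate(step, init_lr=0.001, warmup_steps=4000):
    """Noam-style schedule: linear warmup, then inverse-square-root decay.

    The defaults are illustrative assumptions, not values read from the repo.
    """
    step = max(step, 1)  # guard against step 0, where step ** -0.5 is undefined
    return init_lr * warmup_steps ** 0.5 * min(
        step * warmup_steps ** -1.5,  # warmup branch: grows linearly with step
        step ** -0.5,                 # decay branch: shrinks as 1 / sqrt(step)
    )


def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale all gradients together so their joint L2 norm is at most max_norm.

    `grads` is a list of flat lists of floats (one list per parameter).
    Clipping by global norm is an assumed variant; the source only says
    "gradient clipping".
    """
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    scale = min(1.0, max_norm / global_norm) if global_norm > 0 else 1.0
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, global_norm


if __name__ == "__main__":
    for step in (1, 1000, 4000, 16000):
        print(step, noam_learning_rate(step))
    print(clip_by_global_norm([[3.0, 4.0]], max_norm=1.0))
```

With these assumed defaults the learning rate peaks at `init_lr` exactly when `step` equals `warmup_steps` and halves each time the step count quadruples after that.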

About tacotron_asr

Kyubyong/tacotron_asr

Speech Recognition Using Tacotron

Adapts the Tacotron text-to-speech architecture for automatic speech recognition by reversing the task flow: mel-spectrogram and linear spectrogram inputs are converted to character-level text output. Built on TensorFlow 1.1 with attention-based encoder-decoder networks and trained on the World English Bible dataset (audio paired with verse-level text transcriptions). Demonstrates competitive results on long-form speech recognition while showcasing the architectural flexibility of the original Tacotron model for inverse tasks.
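To make "character-level text output" concrete, here is a minimal sketch of how per-step character scores could be turned back into text with greedy decoding. Everything in it is hypothetical: the vocabulary string, the `P` (padding) and `E` (end of sentence) markers, and the helper names are assumptions for illustration, not the repository's actual code.

```python
# Hypothetical vocabulary: "P" = padding, "E" = end of sentence.
VOCAB = "PE abcdefghijklmnopqrstuvwxyz'"
char2idx = {c: i for i, c in enumerate(VOCAB)}
idx2char = dict(enumerate(VOCAB))


def text_to_ids(text):
    """Encode a transcript as character indices, ending with the E marker."""
    # Lowercasing means the uppercase P/E markers can never come from the text.
    ids = [char2idx[c] for c in text.lower() if c in char2idx]
    return ids + [char2idx["E"]]


def greedy_decode(step_scores):
    """Pick the highest-scoring character at each step until E is produced.

    `step_scores` is a list with one list of per-character scores per step.
    """
    chars = []
    for scores in step_scores:
        idx = max(range(len(scores)), key=scores.__getitem__)  # argmax
        symbol = idx2char[idx]
        if symbol == "E":
            break
        if symbol != "P":
            chars.append(symbol)
    return "".join(chars)


def one_hot(idx, size=len(VOCAB)):
    """Build a score vector with all weight on one character (for the demo)."""
    return [1.0 if i == idx else 0.0 for i in range(size)]


if __name__ == "__main__":
    # Stand-in for model output: perfect one-hot scores plus trailing padding.
    scores = [one_hot(i) for i in text_to_ids("In the beginning")]
    scores += [one_hot(char2idx["P"])] * 3
    print(greedy_decode(scores))
```

In a real recognizer the score vectors would come from the decoder's output layer at each step; the one-hot vectors here only stand in for that output.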

Scores updated daily from GitHub, PyPI, and npm data.