tacotron and tacotron_asr
These are ecosystem siblings: tacotron implements the model for text-to-speech (TTS) synthesis, while tacotron_asr adapts the same encoder-decoder design to the reverse task, automatic speech recognition (ASR).
About tacotron
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Implements an encoder-decoder architecture with an attention mechanism and a Griffin-Lim vocoder for spectrogram-to-waveform conversion, trained on multiple public datasets (LJ Speech, audiobooks, Bible recordings). Includes a heavily documented training pipeline with bucketed batches, Noam learning rate scheduling, and gradient clipping, plus pre-trained checkpoints and attention visualization tools for monitoring alignment quality during training.
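The Griffin-Lim step recovers a waveform from a magnitude spectrogram by iteratively re-estimating phase. A minimal NumPy sketch of the idea (not the repo's implementation; frame sizes, hop length, and the zero-phase initialization here are illustrative assumptions):

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Windowed short-time Fourier transform: (frames, n_fft//2 + 1)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    return np.stack([np.fft.rfft(win * x[i * hop:i * hop + n_fft])
                     for i in range(n_frames)])

def istft(S, n_fft=256, hop=64):
    """Inverse STFT via overlap-add with window-sum normalization."""
    win = np.hanning(n_fft)
    out = np.zeros(n_fft + hop * (S.shape[0] - 1))
    wsum = np.zeros_like(out)
    for i, frame in enumerate(S):
        out[i * hop:i * hop + n_fft] += np.fft.irfft(frame, n=n_fft) * win
        wsum[i * hop:i * hop + n_fft] += win ** 2
    nz = wsum > 1e-8
    out[nz] /= wsum[nz]
    return out

def griffin_lim(mag, n_iter=30, n_fft=256, hop=64):
    """Iteratively refine phase so the synthesized signal's STFT
    magnitude approaches the target magnitude `mag`."""
    phase = np.ones_like(mag, dtype=complex)  # zero-phase init (assumption)
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop)
        phase = np.exp(1j * np.angle(stft(x, n_fft, hop)))
    return istft(mag * phase, n_fft, hop)
```

Each iteration projects onto the target magnitude and back onto the set of consistent spectrograms, so reconstruction error decreases monotonically; production systems typically run 30 to 100 iterations.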
About tacotron_asr
Kyubyong/tacotron_asr
Speech Recognition Using Tacotron
Adapts the Tacotron text-to-speech architecture for automatic speech recognition by reversing the task flow: mel-spectrogram and linear-spectrogram inputs are mapped to character-level text output. Built on TensorFlow 1.1 with attention-based encoder-decoder networks and trained on the World English Bible dataset (audio paired with verse-level text transcriptions). Demonstrates competitive results on long-form speech recognition while showcasing the flexibility of the original Tacotron architecture for the inverse task.
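Reversing the task flow means the decoder emits character indices instead of spectrogram frames, so transcripts must be mapped to and from integer targets. A minimal sketch of such a character vocabulary (the `P`/`E` padding and end-of-sequence symbols are illustrative assumptions, not the repo's exact preprocessing):

```python
PAD, EOS = "P", "E"  # hypothetical reserved symbols; real text must not contain them

def build_vocab(texts):
    """Build char<->index maps over a corpus of transcripts."""
    chars = sorted(set("".join(texts)))
    idx2char = [PAD, EOS] + chars
    char2idx = {c: i for i, c in enumerate(idx2char)}
    return char2idx, idx2char

def encode(text, char2idx):
    """Transcript -> integer targets, terminated with EOS."""
    return [char2idx[c] for c in text] + [char2idx[EOS]]

def decode(ids, idx2char):
    """Integer predictions -> text, dropping reserved symbols."""
    return "".join(idx2char[i] for i in ids if idx2char[i] not in (PAD, EOS))
```

During training the encoded sequence is the decoder's target; at inference, greedy or beam decoding over these indices is stopped when EOS is emitted.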