bshall/Tacotron

A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

/ 100

Established

Implements location-relative attention with dynamic convolution to improve alignment robustness in text-to-mel-spectrogram synthesis, and trains stably on a single GPU with mixed precision. Integrates with the UniversalVocoder for end-to-end audio generation from text, using CMUDict to convert text to phonemes. Provides pretrained LJSpeech weights and preprocessing utilities for training on a dataset, along with training adjustments, including gradient clipping and a modified learning schedule, for efficient single-GPU convergence.
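To illustrate what "location-relative" means here, the following is a toy sketch in plain Python, not the repository's code. It assumes a single fixed, hand-picked kernel (the names `step_alignment` and `forward_kernel` are hypothetical), whereas the actual mechanism uses learned static and dynamic filters. The point it shows is that the next alignment is computed only from the previous alignment, so attention advances along the input rather than jumping to content that merely looks similar.

```python
def step_alignment(prev_alignment, kernel):
    """Advance an attention alignment by one decoder step.

    Toy model: the new weight at position j is a causal convolution of the
    previous alignment with a fixed kernel, where kernel[k] is the weight
    given to moving forward by k encoder positions.
    """
    n = len(prev_alignment)
    new = [0.0] * n
    for j in range(n):
        for k, weight in enumerate(kernel):
            if j - k >= 0:
                new[j] += weight * prev_alignment[j - k]
    # Renormalise, since mass that moves past the last position is dropped.
    total = sum(new)
    if total == 0.0:
        raise ValueError("alignment has no mass left to normalise")
    return [value / total for value in new]


if __name__ == "__main__":
    # Start fully focused on the first encoder position.
    alignment = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    # Hypothetical kernel: mostly move forward by one position per step.
    forward_kernel = [0.2, 0.7, 0.1]
    for step in range(4):
        alignment = step_alignment(alignment, forward_kernel)
        peak = alignment.index(max(alignment))
        print(f"step {step + 1}: peak at position {peak}")
```

Because the kernel is causal (it has no weight on backward moves), the alignment in this sketch can only stay in place or move forward, which is the property that makes this family of mechanisms attractive for long utterances.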

115 stars and 32 monthly downloads. No commits in the last 6 months. Available on PyPI.

Stale 6m

Maintenance 0 / 25

Adoption 13 / 25

Maturity 25 / 25

Community 20 / 25

Stars

115

Forks

Language

Python

License

MIT

Compare

Tacotron and tacotron

Tacotron and Tacotron-pytorch

Related tools

Kyubyong/tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Rayhane-mamah/Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

DemisEom/SpecAugment

An Implementation of SpecAugment with Tensorflow & Pytorch, introduced by Google Brain

Kyubyong/dc_tts

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

vlomme/Multi-Tacotron-Voice-Cloning

Phoneme multilingual(Russian-English) voice cloning based on
