Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Implements the DC-TTS architecture as a two-stage pipeline: Text2Mel generates coarse mel-spectrograms from text, using guided attention to encourage a monotonic alignment, and SSRN (Spectrogram Super-resolution Network) upsamples them to full-resolution spectrograms, from which the waveform is then reconstructed. Supports training on English and Korean datasets, and adds layer normalization and learning rate decay, two practical departures from the original paper that improve convergence.
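The guided attention mentioned above can be illustrated with a minimal sketch. It follows the penalty described in the DC-TTS paper, W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2g^2)) with g = 0.2, which is near zero along the diagonal and grows away from it. The function names here are hypothetical, and the code is plain Python for readability; the repository's own TensorFlow code may differ in detail.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Penalty matrix W[n][t] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)).

    n_text (N) is the number of text positions, n_frames (T) the number of
    mel frames. g = 0.2 is the value given in the DC-TTS paper.
    """
    return [
        [
            1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2.0 * g * g))
            for t in range(n_frames)
        ]
        for n in range(n_text)
    ]


def guided_attention_loss(attention, g=0.2):
    """Mean of attention[n][t] * W[n][t] over all positions.

    `attention` is an N x T nested list of attention probabilities.
    Attention mass far from the diagonal is penalized; mass on the
    diagonal costs nothing.
    """
    n_text = len(attention)
    n_frames = len(attention[0])
    weights = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        attention[n][t] * weights[n][t]
        for n in range(n_text)
        for t in range(n_frames)
    )
    return total / (n_text * n_frames)
```

Adding this term to the training loss nudges Text2Mel toward reading the text in order, which is why a diagonal attention matrix scores lower than a reversed one.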
1,159 stars. No commits in the last 6 months.
Stars: 1,159
Forks: 360
Language: Python
License: Apache-2.0
Category:
Last pushed: Apr 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Kyubyong/dc_tts"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
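The same request can be made from Python. This is a sketch under two assumptions not stated on the page: that the path segments after `/quality/` are category, owner, and repository name, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the request URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and parse it, assuming a JSON response body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "Kyubyong", "dc_tts")` issues the same GET request as the curl command and counts against the same daily limit.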
Related tools
bshall/Tacotron
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
DemisEom/SpecAugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
vlomme/Multi-Tacotron-Voice-Cloning
Phoneme multilingual (Russian-English) voice cloning based on