KinglittleQ/GST-Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Overall score: 48 / 100 (Emerging)

Implements Global Style Tokens (GST) with a Tacotron2 encoder-decoder architecture to learn unsupervised prosodic representations from speech spectrograms, enabling fine-grained style control and transfer across speakers without explicit style labels. Supports multispeaker datasets including Blizzard, with modular components for encoder/decoder networks, loss computation, and mel-spectrogram synthesis. Designed for PyTorch with configurable hyperparameters and dataset preprocessing pipelines for training end-to-end TTS models.
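As a rough illustration of the style-token idea described above, the sketch below attends from a reference embedding over a small bank of tokens and returns the weighted sum as the style embedding. It is a simplified, framework-free approximation written for this summary, not code from the repository: it uses single-head dot-product attention with no learned projections, and the names `softmax`, `style_embedding`, and `combine_tokens` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_tokens(weights, tokens):
    """Weighted sum of style tokens; the result is the style embedding."""
    dim = len(tokens[0])
    return [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]


def style_embedding(reference, tokens):
    """Attend from a reference embedding (the query) over the token bank.

    Assumption: scaled dot-product scoring, one head, no projections.
    A real GST layer would normally learn these components.
    """
    scale = math.sqrt(len(reference))
    scores = [sum(r * t for r, t in zip(reference, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    return weights, combine_tokens(weights, tokens)


# Toy bank of two 2-dimensional tokens (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0]]

# Style transfer: weights are inferred from a reference embedding.
weights, embedding = style_embedding([0.0, 0.0], tokens)

# Style control: weights are chosen by hand, with no reference audio.
manual = combine_tokens([1.0, 0.0], tokens)
```

The two calls at the end mirror the two uses named in the description: transfer derives the token weights from reference audio, while control sets them directly.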

374 stars. No commits in the last 6 months.

Flags: Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25

Stars: 374
Forks: 71
Language: Python
License: MIT
Last pushed: Dec 08, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/KinglittleQ/GST-Tacotron"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
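The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path follows a category/owner/repository pattern (inferred from the single URL shown above) and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, as in the curl example.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record.

    Assumption: the endpoint returns a JSON body; no API key is sent.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (performs a network request when called):
# data = fetch_quality("voice-ai", "KinglittleQ", "GST-Tacotron")
```

Keeping URL construction separate from the network call makes the request easy to check offline and keeps unnecessary calls off the 100 requests/day allowance.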