hash2430/pitchtron
TTS for pitch-accented language. Korean dialect DB.
Implements dual prosody transfer approaches (hard and soft) that enable pitch-contour-based style transfer for Korean dialects and emotional speech without requiring parallel training data or style-specific training sets. Hard pitchtron enforces strict pitch matching when sentence structures align, while soft pitchtron pursues naturalness even with entirely different reference and target sentences through pitch-range scaling and per-phoneme control. Built on multi-speaker Korean TTS pipelines with preprocessing support for converting raw audio formats and integrating multiple Korean speech datasets.
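The pitch-range scaling mentioned above can be illustrated with a minimal sketch. It assumes a simple linear min-max mapping of voiced F0 values in Hz, with 0 marking unvoiced frames. The function name `scale_pitch_range` and the mapping itself are hypothetical, chosen for illustration; the listing does not say which formula the repository uses.

```python
from typing import List


def scale_pitch_range(ref_f0: List[float], target_min: float, target_max: float) -> List[float]:
    """Map a reference F0 contour (Hz) into a target speaker's pitch range.

    Assumption: a plain linear min-max mapping, not necessarily what
    pitchtron does. Frames with F0 <= 0 are treated as unvoiced and
    returned as 0.0.
    """
    voiced = [f for f in ref_f0 if f > 0]
    if not voiced:
        # Nothing to scale: the contour has no voiced frames.
        return list(ref_f0)

    ref_min, ref_max = min(voiced), max(voiced)
    span = ref_max - ref_min

    scaled = []
    for f in ref_f0:
        if f <= 0:
            scaled.append(0.0)  # keep unvoiced frames unvoiced
        elif span == 0:
            # Flat reference contour: place it at the middle of the target range.
            scaled.append((target_min + target_max) / 2)
        else:
            # Preserve the relative position of f within the reference range.
            scaled.append(target_min + (f - ref_min) / span * (target_max - target_min))
    return scaled
```

For example, a reference contour spanning 200 to 300 Hz mapped onto a 100 to 150 Hz target keeps its shape (rise, fall, relative peaks) while moving into the target speaker's range, which is the general idea behind transferring a contour between speakers with different vocal ranges.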
157 stars. No commits in the last 6 months.
Stars: 157
Forks: 29
Language: Python
License: not specified
Category:
Last pushed: May 12, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hash2430/pitchtron"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
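The same request can be sketched in Python with only the standard library. The URL is the one from the curl command above; the helper names `quality_url` and `fetch_quality`, the idea that other repositories follow the same `category/owner/repo` path pattern, and the assumption that the response body is JSON are all hypothetical and not confirmed by this page.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow a category/owner/repo pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. No API key is sent,
    which the page says is allowed for up to 100 requests/day.
    """
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "hash2430", "pitchtron")` would issue the same GET request as the curl command.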
Higher-rated alternatives
bshall/Tacotron
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
DemisEom/SpecAugment
An Implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model