keonlee9420/Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
The non-autoregressive architecture enables fast inference while conditioning on categorical or continuous emotion descriptors and on conversational context, each implemented in a separate branch. The project includes support for annotated datasets (IEMOCAP for English, AIHub Multimodal for Korean) and language-specific text processing pipelines, with Montreal Forced Aligner integration for adapting to new languages. As a PyTorch framework extending the base FastSpeech2 architecture, it provides multi-speaker synthesis with emotion-aware and conversation-aware prosody control.
318 stars. No commits in the last 6 months.
Stars: 318
Forks: 48
Language: Python
License: —
Category:
Last pushed: Aug 25, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/keonlee9420/Expressive-FastSpeech2"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
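The same request can be made from Python. This is a minimal sketch that mirrors the curl command above using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical. The path layout (category, then owner, then repository) is inferred from the single example URL, and the JSON response format is an assumption, since the page does not document either.

```python
import json
import urllib.parse
import urllib.request

# Base endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (hypothetical helper).

    Assumes the path is <category>/<owner>/<repo>, as in the one
    example URL shown on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository (hypothetical helper).

    Assumes the endpoint returns a JSON body; the response schema is
    not documented here, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "keonlee9420", "Expressive-FastSpeech2")` issues the same GET request as the curl command. Without a key, the 100 requests/day limit stated above applies.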
Higher-rated alternatives
TensorSpeech/TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for...
lucasnewman/nanospeech
A simple, hackable text-to-speech system in PyTorch and MLX
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing,...
jxzhanggg/nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
keonlee9420/STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech...