rishikksh20/AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
Built on FastSpeech 2, this PyTorch implementation applies utterance-level and phoneme-level acoustic embeddings to improve generalization across speaking styles without requiring conditional layer normalization. It preprocesses audio with the Montreal Forced Aligner for duration extraction and normalizes prosodic features (F0, energy) per dataset, then trains an encoder-decoder architecture that conditions acoustic predictions on speaker characteristics. It targets single-speaker TTS scenarios where adaptive acoustic modeling enhances synthesis quality beyond base FastSpeech 2.
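The per-dataset normalization of F0 and energy mentioned above can be illustrated with a minimal sketch. The description does not say which scheme the repository uses, so the z-score formula, the treatment of 0.0 as an unvoiced frame, and the names `fit_stats` and `normalize` are all assumptions made for illustration, not the project's actual code.

```python
from statistics import mean, pstdev


def fit_stats(utterances):
    """Compute one mean and standard deviation over the whole dataset.

    Assumption: a value of 0.0 marks an unvoiced frame and is excluded.
    """
    values = [v for utt in utterances for v in utt if v != 0.0]
    return mean(values), pstdev(values)


def normalize(utt, mu, sigma):
    """Z-score the voiced frames; leave unvoiced frames at 0.0."""
    return [(v - mu) / sigma if v != 0.0 else 0.0 for v in utt]


# Hypothetical F0 tracks in Hz for two utterances (illustrative values only).
f0_tracks = [[0.0, 100.0, 120.0], [110.0, 0.0, 130.0]]

mu, sigma = fit_stats(f0_tracks)
normalized = [normalize(utt, mu, sigma) for utt in f0_tracks]
```

The same two functions would apply unchanged to energy contours. Computing the statistics once over the full dataset, rather than per utterance, is what makes the normalization "per dataset".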
162 stars. No commits in the last 6 months.
Stars: 162
Forks: 43
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Aug 31, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rishikksh20/AdaSpeech"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
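The curl request above could also be issued from Python using only the standard library. This is a rough sketch: the `category/owner/repo` path layout is inferred from the single example URL, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are hypothetical helper names, not part of any published client.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the example URL on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout assumed from the one example)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the endpoint and decode the body, assuming it is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "rishikksh20", "AdaSpeech")` would request the same URL as the curl command. Because the response fields are not documented here, inspect the returned object before relying on any particular key.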
Higher-rated alternatives
TensorSpeech/TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for...
lucasnewman/nanospeech
A simple, hackable text-to-speech system in PyTorch and MLX
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing,...
yl4579/PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
rishikksh20/FastSpeech2
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech