rishikksh20/AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

40 / 100 (Emerging)

Built on FastSpeech 2, this PyTorch implementation adds utterance-level and phoneme-level acoustic embeddings to improve generalization across speaking styles, without requiring conditional layer normalization. It preprocesses audio with the Montreal Forced Aligner to extract phoneme durations, normalizes the prosodic features (F0 and energy) per dataset, and then trains an encoder-decoder architecture that conditions acoustic predictions on speaker characteristics. It targets single-speaker TTS scenarios where adaptive acoustic modeling improves synthesis quality beyond base FastSpeech 2.
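The per-dataset normalization of F0 and energy mentioned above can be illustrated with a minimal sketch. This is not the repository's preprocessing code: the function names `dataset_stats` and `normalize` are hypothetical, and the sketch assumes plain z-score normalization with statistics pooled over every frame in the dataset. Whether unvoiced (zero) F0 frames are excluded from the statistics is an implementation choice that is not covered here.

```python
from statistics import fmean, pstdev


def dataset_stats(utterances):
    """Return (mean, std) computed over all frames pooled across utterances.

    `utterances` is a list of per-utterance value lists (e.g. F0 in Hz).
    Hypothetical helper for illustration only.
    """
    pooled = [value for utt in utterances for value in utt]
    mean = fmean(pooled)
    std = pstdev(pooled, mu=mean)  # population standard deviation
    return mean, std


def normalize(utterances, mean, std, eps=1e-8):
    """Apply z-score normalization with dataset-level statistics.

    `eps` guards against division by zero for a constant feature.
    """
    return [[(value - mean) / (std + eps) for value in utt] for utt in utterances]


# Toy F0 contours for two utterances (made-up numbers).
f0 = [[100.0, 200.0], [300.0]]
f0_mean, f0_std = dataset_stats(f0)
f0_normalized = normalize(f0, f0_mean, f0_std)
```

The same two functions would be applied separately to the energy values, so that each feature gets its own dataset-level mean and standard deviation.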

162 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 9 / 25
Community 21 / 25

Stars: 162
Forks: 43
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Aug 31, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rishikksh20/AdaSpeech"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
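For scripted access, the same request can be sketched in Python using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical. The `category/owner/repo` path layout is inferred from the single URL shown above, and the sketch assumes the endpoint returns JSON, which the page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the request URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and parse it as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "rishikksh20", "AdaSpeech")` would request the same URL as the curl command. Given the keyless limit of 100 requests/day, a script that polls many repositories would need to cache results or use a key.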