Emotional-Text-to-Speech/dl-for-emo-tts

💻 🤖 A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech 🔊

/ 100

Emerging

Implements Tacotron and DC-TTS models that are pre-trained on LJ Speech and then fine-tuned on emotional speech datasets (RAVDESS, EMOV-DB), systematically exploring transfer learning strategies such as encoder freezing and learning-rate adjustments. Provides reproducible training pipelines and a comparative analysis across multiple emotion corpora, and documents the configurations that succeeded for single-speaker synthesis with monotonic attention constraints.
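
As a rough illustration of the two transfer learning strategies named above (encoder freezing and a reduced fine-tuning learning rate), here is a minimal, framework-free sketch. It is not taken from the repository: the parameter names, the `plan_fine_tuning` helper, the `encoder.` prefix, and the learning-rate values are all hypothetical and stand in for whatever the actual training code uses.

```python
from dataclasses import dataclass


@dataclass
class ParamPlan:
    """Fine-tuning settings for one named parameter (hypothetical structure)."""
    name: str
    trainable: bool
    lr: float


def plan_fine_tuning(param_names, base_lr=1e-3, lr_scale=0.1,
                     frozen_prefixes=("encoder.",)):
    """Build a per-parameter plan for fine-tuning a pre-trained model.

    Parameters whose names start with one of `frozen_prefixes` are frozen
    (not updated). All others are trained with a learning rate scaled down
    from the assumed pre-training rate `base_lr`.
    """
    fine_tune_lr = base_lr * lr_scale
    plan = []
    for name in param_names:
        frozen = name.startswith(tuple(frozen_prefixes))
        plan.append(ParamPlan(name=name,
                              trainable=not frozen,
                              lr=0.0 if frozen else fine_tune_lr))
    return plan


# Hypothetical parameter names for an encoder/attention/decoder TTS model.
names = [
    "encoder.embedding.weight",
    "encoder.conv.weight",
    "attention.query.weight",
    "decoder.rnn.weight",
]

plan = plan_fine_tuning(names)
for p in plan:
    status = "train" if p.trainable else "frozen"
    print(f"{p.name}: {status}, lr={p.lr}")
```

In a real framework the same plan would typically be applied by disabling gradients for the frozen parameters and passing only the trainable ones to the optimizer with the reduced learning rate.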

458 stars. No commits in the last 6 months.

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 16 / 25

Stars

458

Forks

Language

Jupyter Notebook

License

MIT

Higher-rated alternatives

aqibsaeed/Urban-Sound-Classification

Urban sound classification using Deep Learning

spotify/realbook

Easier audio-based machine learning with TensorFlow.

ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico

ML Audio Classifier Example for Pico 🔊🔥🔔

IliaZenkov/sklearn-audio-classification

An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering,...

mimbres/neural-audio-fp

Official implementation of Neural Audio Fingerprint (ICASSP 2021)
