keonlee9420/DailyTalk

Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023

/ 100

Emerging

Contains 2,541 recorded dialogues annotated with conversational attributes, enabling the training of context-aware TTS systems that model dialogue history. The baseline is a non-autoregressive, transformer-based synthesizer with an optional dialogue-history encoder (supporting the approach of Guo et al.) and a HiFi-GAN vocoder. It uses unsupervised phoneme-level duration modeling and the Montreal Forced Aligner for alignment extraction.
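As a rough illustration of what "modeling dialogue history" means in practice, the sketch below pairs each turn of a dialogue with the turns that precede it, which is the kind of context a history encoder would consume. It is not the repository's data loader: the `Turn` class, the `history_windows` function, the `max_history` parameter, and the sample sentences are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One utterance in a dialogue (hypothetical structure, not the dataset schema)."""
    speaker: int
    text: str


def history_windows(dialogue: List[Turn], max_history: int = 2) -> List[List[Turn]]:
    """For each turn, collect up to `max_history` turns that come before it."""
    windows = []
    for i in range(len(dialogue)):
        start = max(0, i - max_history)
        # Slice ends at i, so the current turn is never part of its own history.
        windows.append(dialogue[start:i])
    return windows


# Made-up sample dialogue between two speakers.
dialogue = [
    Turn(0, "Hi, how are you?"),
    Turn(1, "Fine, thanks. And you?"),
    Turn(0, "Pretty good."),
    Turn(1, "Glad to hear it."),
]
windows = history_windows(dialogue, max_history=2)
```

Here the first turn gets an empty history and the last turn gets the two turns before it. A context-aware model would encode each window alongside the current text, whereas a context-free baseline would ignore it.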

252 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 11 / 25

Stars

252

Forks

Language

Python

License

MIT

Higher-rated alternatives

hetpandya/youtube_tts_data_generator

A Python library to generate speech datasets from YouTube videos

IS2AI/Kazakh_TTS

An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...

Hecate2/sukasuka-vocal-dataset-builder

Sukasuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...

youmebangbang/TTS-dataset-tools

Automatically generates a TTS dataset from audio and associated text. Make cuts under a custom...

taresh18/TTSizer

🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨