keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
Contains 2,541 recorded dialogues with annotated conversational attributes, enabling training of context-aware TTS systems that model dialogue history. The baseline implements non-autoregressive transformer-based synthesis with optional historical encoders (supporting the Guo et al. approach) and uses HiFi-GAN vocoding, with unsupervised phoneme-level duration modeling and Montreal Forced Aligner for alignment extraction.
252 stars. No commits in the last 6 months.
Stars
252
Forks
14
Language
Python
License
MIT
Category
Last pushed
Jun 05, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/keonlee9420/DailyTalk"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A python library to generate speech dataset from Youtube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
Hecate2/sukasuka-vocal-dataset-builder
すかすかアニメボカロデータセット。1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...
taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨