hetpandya/youtube_tts_data_generator
A Python library to generate speech datasets from YouTube videos
Automatically downloads YouTube videos with subtitles, extracts audio, and aligns transcriptions through intelligent segmentation based on subtitle timing. Includes built-in preprocessing pipelines for silence trimming, audio concatenation with configurable length limits, and metadata generation in LJ Speech or JSON formats compatible with TTS frameworks. Supports multi-language subtitle extraction and produces standardized directory structures with paired audio/text files ready for speech synthesis model training.
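To make the concatenation and metadata steps described above concrete, here is a minimal sketch in plain Python. It is not this library's API: `Segment`, `merge_segments`, `ljspeech_lines`, the `max_len_s` parameter, and the `clip_0001` naming scheme are all hypothetical names chosen for illustration. The only external fact assumed is that LJ Speech-style metadata uses pipe-delimited lines of the form `id|transcription|normalized transcription`; the library's actual output may differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One subtitle cue: start/end times in seconds plus its text (hypothetical type)."""
    start: float
    end: float
    text: str


def merge_segments(segments, max_len_s=10.0):
    """Greedily merge consecutive cues while the merged span stays within max_len_s.

    The span is measured from the first cue's start to the last cue's end, so
    gaps between cues count toward the limit. A single cue longer than
    max_len_s is kept as its own clip rather than being split.
    """
    merged = []
    current = None
    for seg in segments:
        if current is None:
            current = Segment(seg.start, seg.end, seg.text)
        elif seg.end - current.start <= max_len_s:
            # Extending the current clip still fits under the limit.
            current.end = seg.end
            current.text = f"{current.text} {seg.text}"
        else:
            # Limit would be exceeded: close the current clip, start a new one.
            merged.append(current)
            current = Segment(seg.start, seg.end, seg.text)
    if current is not None:
        merged.append(current)
    return merged


def ljspeech_lines(clips, prefix="clip"):
    """Build LJ Speech-style metadata lines: id|transcription|normalized transcription.

    No text normalization is done here, so the same text fills both fields.
    """
    lines = []
    for i, clip in enumerate(clips, start=1):
        clip_id = f"{prefix}_{i:04d}"
        lines.append(f"{clip_id}|{clip.text}|{clip.text}")
    return lines
```

With three cues spanning 0-3 s, 3-6 s, and 6-12 s and a 10-second limit, the first two merge into one clip and the third becomes its own clip, giving two metadata lines. A greedy pass is used here only because it is the simplest strategy that respects a length cap; it is an assumption, not a description of how the library segments audio.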
No commits in the last 6 months. Available on PyPI.
Stars: 37
Forks: 8
Language: Python
License: Apache-2.0
Category:
Last pushed: Jun 07, 2024
Monthly downloads: 55
Commits (30d): 0
Dependencies: 14
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hetpandya/youtube_tts_data_generator"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
Hecate2/sukasuka-vocal-dataset-builder
Sukasuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...
taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023