hetpandya/youtube_tts_data_generator

A Python library to generate speech datasets from YouTube videos

Score: 53 / 100 (Established)

Downloads YouTube videos that have subtitles, extracts the audio, and aligns transcriptions by splitting the audio at subtitle timestamps. Includes built-in preprocessing for silence trimming, concatenation of short clips up to a configurable length limit, and metadata generation in LJ Speech or JSON formats compatible with TTS frameworks. Supports multi-language subtitle extraction and produces a standardized directory structure of paired audio/text files ready for training speech synthesis models.
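The concatenation and metadata steps described above can be pictured with a minimal sketch. It does not use the library's own API: `Segment`, `merge_segments`, and `to_ljspeech_rows` are hypothetical names, the 10-second limit is an assumed value, and the normalized-text column simply repeats the raw text. It merges consecutive subtitle cues until a length limit would be exceeded, then formats each merged clip as an LJ Speech-style `id|text|normalized text` row.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One subtitle cue (hypothetical type, not part of the library)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_segments(cues: List[Segment], max_len: float = 10.0) -> List[Segment]:
    """Greedily merge consecutive cues while the merged span stays within max_len.

    A single cue that is already longer than max_len is kept unchanged.
    """
    merged: List[Segment] = []
    for cue in cues:
        if merged and cue.end - merged[-1].start <= max_len:
            last = merged[-1]
            merged[-1] = Segment(last.start, cue.end, f"{last.text} {cue.text}")
        else:
            merged.append(cue)
    return merged


def to_ljspeech_rows(segments: List[Segment], prefix: str = "clip") -> List[str]:
    """Format segments as pipe-separated LJ Speech-style metadata rows."""
    return [
        f"{prefix}_{i:04d}|{s.text}|{s.text}"  # assumption: no text normalization
        for i, s in enumerate(segments, start=1)
    ]


cues = [
    Segment(0.0, 3.0, "Hello there."),
    Segment(3.2, 6.0, "Welcome back."),
    Segment(6.5, 12.0, "Today we talk about speech."),
]
rows = to_ljspeech_rows(merge_segments(cues))
```

Here the first two cues fit within the limit and are merged, while the third would push the clip past 10 seconds and so starts a new one.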

No commits in the last 6 months. Available on PyPI.

Maintenance 0 / 25
Adoption 11 / 25
Maturity 25 / 25
Community 17 / 25

Stars: 37
Forks: 8
Language: Python
License: Apache-2.0
Last pushed: Jun 07, 2024
Monthly downloads: 55
Commits (30d): 0
Dependencies: 14

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hetpandya/youtube_tts_data_generator"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
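The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the URL path segments are category, owner, and repository (inferred from the single curl example above), and it returns the response body as raw text because the response schema is not documented here. `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is an assumption from the curl example."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the quality record and return the body as text (schema not assumed)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = quality_url("voice-ai", "hetpandya", "youtube_tts_data_generator")
```

Calling `fetch_quality("voice-ai", "hetpandya", "youtube_tts_data_generator")` performs the network request and counts against the daily limit.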