Hecate2/sukasuka-vocal-dataset-builder
すかすかアニメボカロデータセット。1st anime vocal dataset. Extract audio (vocal) files from video based on .ass subtitle files; manually label vocal files to characters. Will be used for PITS/VITS/Diffusion text-to-speech/SVC. 根据字幕,从视频里抽取全部语音,然后手动按角色标注。
Stars
49
Forks
4
Language
Python
License
MIT
Category
Last pushed
Feb 25, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Hecate2/sukasuka-vocal-dataset-builder"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A python library to generate speech dataset from Youtube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...
taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023