rioharper/VocalForge
Your one-stop solution for voice dataset creation
Combines Whisper transcription, PyAnnote speaker diarization, and CTC segmentation to automatically process raw audio into aligned speech datasets with minimal manual curation. The toolkit handles speaker isolation, voice activity detection, noise filtering, and text normalization across multiple audio sources, then exports in LJSpeech format. Includes VCAuditor, a browser-based verification interface for reviewing waveforms, correcting alignments, and filtering low-confidence segments before final dataset export.
130 stars. No commits in the last 6 months.
Stars: 130
Forks: 24
Language: Python
License: MIT
Category:
Last pushed: Dec 10, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rioharper/VocalForge"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
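The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path segments are category, owner, and repository (inferred from the single URL above), and that the endpoint returns JSON. The function names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "voice-ai") -> str:
    """Build the endpoint URL.

    Assumption: the segments are <category>/<owner>/<repo>, as suggested
    by the one example URL on this page.
    """
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "voice-ai",
                  timeout: float = 10.0):
    """Fetch and decode the response.

    Assumption: the body is JSON; its fields are not documented here,
    so the decoded object is returned as-is.
    """
    with urlopen(build_quality_url(owner, repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("rioharper", "VocalForge")` issues the same GET request as the curl command and counts against the daily request limit.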
Higher-rated alternatives
voicegain/platform
Voicegain Enterprise Speech-to-Text Platform (API, Portal, etc.)
davidamacey/OpenTranscribe
Self-hosted AI-powered transcription platform with speaker diarization, search, and...
aws-samples/amazon-transcribe-live-call-analytics
Amazon Transcribe Live Call Analytics (LCA) Sample Solution
SamirPaulb/real-time-voice-translator
A desktop application that uses AI to translate voice between languages in real time, while...
jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10...