EMUNES/Auto-Subtitle-File-Generation
Generate subtitle files with timelines in an automatic way.
Archived
Leverages deep learning-based Sound Event Detection with PyTorch to identify speech segments and their precise timelines from audio/video inputs, outputting subtitles in .ass or .srt formats. The architecture uses pretrained PANNs audio neural networks combined with weak-label training, and integrates the Vosk API for offline speech-to-text recognition to populate subtitle content. Includes an open-source training pipeline with automated dataset generation from existing videos and subtitle files, enabling custom model fine-tuning.
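To make the "speech segments to subtitle file" step concrete, here is a minimal sketch in Python. It is not the repository's code: the function names, the per-frame probability input, the 0.5 threshold, and the frame duration are all assumptions chosen for illustration. It shows how frame-wise speech probabilities could be grouped into timed segments and then written out as .srt text.

```python
from typing import List, Sequence, Tuple


def frames_to_segments(
    probs: Sequence[float], frame_seconds: float, threshold: float = 0.5
) -> List[Tuple[float, float]]:
    """Group consecutive frames at or above `threshold` into (start, end) times.

    Hypothetical helper: the real model's output format is not described
    in the listing, so a flat list of per-frame probabilities is assumed.
    """
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current speech run
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            segments.append((start * frame_seconds, i * frame_seconds))
            start = None
    if start is not None:  # speech ran until the final frame
        segments.append((start * frame_seconds, len(probs) * frame_seconds))
    return segments


def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Sequence[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) triples as SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)
```

In a full pipeline, the text of each cue would come from a speech recognizer run on the corresponding audio span; here it is simply passed in as a string.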
No commits in the last 6 months.
Stars
62
Forks
17
Language
Jupyter Notebook
License
Apache-2.0
Category
Last pushed
Aug 10, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/EMUNES/Auto-Subtitle-File-Generation"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
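The same request can be sketched in Python using only the standard library. The URL pattern is copied from the curl command above; the helper names are hypothetical, treating `ml-frameworks` as a variable category segment is an assumption, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; not verified beyond that.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str, category: str = "ml-frameworks") -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = (quote(category, safe=""), quote(owner, safe=""), quote(repo, safe=""))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the response body as text; no particular schema is assumed."""
    with urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("EMUNES", "Auto-Subtitle-File-Generation")` issues the same GET request as the curl command and counts against the daily request limit.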
Related frameworks
zerounintezaragler/whisper_python
Whisper Python: getting text from an audio file no longer requires manual conversion, no...
Dicklesworthstone/franken_whisper
Agent-first Rust ASR orchestration stack: Bayesian backend routing across...
sydkwests/kwest-whisper-analysis
Conducted a comprehensive technical analysis of the Whisper model on M-series hardware,...
atahanuz/yt2text
Extract text from a YouTube video in a single command, using OpenAI's Whisper speech recognition model.
Ayushverma135/Whisper-Hindi-ASR-model-IIT-Bombay-Internship
The Whisper Hindi ASR (Automatic Speech Recognition) model utilizes the KathBath dataset, a...