jianchang512/gemini-speech2srt
Transcribe audio and video into SRT subtitles using Gemini AI
Uses VAD (voice activity detection) to split media into chunks before sending each chunk to Gemini AI. Transcribing chunk by chunk keeps subtitle timing precise and avoids the timeline drift that occurs when a whole file is processed at once. Available as a Windows GUI executable and as a cross-platform Python deployment, with configurable prompts and proxy support for regions where Gemini access is restricted.
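The chunk-then-transcribe idea can be illustrated with a minimal sketch. It is not the project's actual code: `Cue`, `group_segments`, `format_timestamp`, and `to_srt` are hypothetical names, the VAD output is assumed to be a list of `(start, end)` speech spans in seconds, and the Gemini call is replaced by a caller-supplied `transcribe` function. The point is that each cue's time is computed as the chunk's known start plus a short relative offset, so timing errors cannot accumulate across the file.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds, relative to the start of its chunk
    end: float    # seconds, relative to the start of its chunk
    text: str


def group_segments(segments, max_len=30.0, max_gap=0.5):
    """Merge neighbouring VAD speech spans into chunks of at most max_len seconds."""
    chunks = []
    for start, end in segments:
        if chunks:
            c_start, c_end = chunks[-1]
            # Extend the current chunk only if the pause is short
            # and the merged chunk stays within the length limit.
            if start - c_end <= max_gap and end - c_start <= max_len:
                chunks[-1] = (c_start, end)
                continue
        chunks.append((start, end))
    return chunks


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(chunks, transcribe):
    """Build SRT text; transcribe(start, end) stands in for the per-chunk model call."""
    lines = []
    index = 1
    for c_start, c_end in chunks:
        for cue in transcribe(c_start, c_end):
            # Shift chunk-relative times onto the global timeline.
            begin = format_timestamp(c_start + cue.start)
            finish = format_timestamp(c_start + cue.end)
            lines += [str(index), f"{begin} --> {finish}", cue.text, ""]
            index += 1
    return "\n".join(lines)
```

In this sketch, `group_segments` output would be passed to `to_srt` together with a function that cuts the audio for each chunk and returns the cues the model produced for it.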
No commits in the last 6 months.
Stars: 54
Forks: 13
Language: Python
License: —
Category:
Last pushed: Jan 11, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jianchang512/gemini-speech2srt"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
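The same request can be made from Python with only the standard library. This is a sketch under assumptions: the URL pattern (category, owner, repository) is inferred from the single example above, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the listing data; assumes the endpoint returns a JSON body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "jianchang512", "gemini-speech2srt")` would request the same URL as the curl command; the fields in the returned data are not documented here, so inspect the result before relying on any key.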
Higher-rated alternatives
mozilla-ai/document-to-podcast
Blueprint by Mozilla.ai for generating podcasts from documents using local AI
iMicknl/azure-podcast-generator
Generate an engaging podcast based on your document using Azure OpenAI and Azure Speech.
BandarLabs/gitpodcast
Convert any git repository into an engaging podcast
puntorigen/podcast_tts
A class for generating realistic audio (TTS) for podcasts and dialogues.
ismailperim/reportcast
Transform reports into podcasts with AI - Nobody reads your reports. But they'll listen.