cxyfer/GeminiASR
A Python tool that uses the Google Gemini API to transcribe video or audio files into SRT subtitle files.
Supports batch processing: input is split into configurable chunks (900-second segments by default) and transcribed in parallel across multiple threads, with optional rotation across multiple Google API keys to work around per-key rate limits. Configuration is resolved through four tiers in order of precedence (CLI arguments > environment variables > TOML files > defaults), and custom prompts can be supplied to guide transcription quality. Also works with OpenAI-compatible endpoints and proxy services such as gemini-balance, so alternative model providers can be used while the SRT output interface stays the same.
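The chunking, key rotation, and configuration precedence described above can be illustrated with a minimal Python sketch. It is not taken from the GeminiASR source: the function names (`chunk_ranges`, `assign_keys`, `resolve_setting`) and the setting name `chunk_s` are hypothetical, and the real tool's option names and internals may differ.

```python
from itertools import cycle


def chunk_ranges(duration_s, chunk_s=900):
    """Split [0, duration_s) into consecutive (start, end) segments.

    Each segment is at most chunk_s seconds long; 900 mirrors the
    default segment length mentioned in the description.
    """
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    ranges = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        ranges.append((start, end))
        start = end
    return ranges


def assign_keys(chunks, api_keys):
    """Pair each chunk with an API key in round-robin order (hypothetical rotation scheme)."""
    if not api_keys:
        raise ValueError("at least one API key is required")
    return list(zip(chunks, cycle(api_keys)))


def resolve_setting(name, cli=None, env=None, toml=None, defaults=None):
    """Return the first value found, checking CLI > environment > TOML > defaults."""
    for layer in (cli, env, toml, defaults):
        if layer and layer.get(name) is not None:
            return layer[name]
    raise KeyError(name)
```

For a 2000-second file, `chunk_ranges(2000)` yields three segments (0 to 900, 900 to 1800, 1800 to 2000), and `assign_keys` with two keys assigns them in the order first, second, first. Round-robin is only one possible rotation policy; a tool could equally rotate on rate-limit errors.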
Stars: 17
Forks: 5
Language: Python
License: MIT
Category:
Last pushed: Jan 02, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/cxyfer/GeminiASR"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
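The same request can be made from Python with only the standard library. This is a rough sketch under assumptions: that the URL pattern generalizes to category/owner/repo as in the curl example, and that the endpoint returns JSON. Neither is confirmed here, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (assumes a category/owner/repo path pattern)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and parse it as JSON (the JSON format is an assumption)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "cxyfer", "GeminiASR")` issues the same GET request as the curl command; with no key, it counts against the 100 requests/day limit.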
Compare
Higher-rated alternatives
mozilla-ai/document-to-podcast
Blueprint by Mozilla.ai for generating podcasts from documents using local AI
iMicknl/azure-podcast-generator
Generate an engaging podcast based on your document using Azure OpenAI and Azure Speech.
BandarLabs/gitpodcast
Convert any git repository into an engaging podcast
puntorigen/podcast_tts
A class for generating realistic audio (TTS) for podcasts and dialogues.
ismailperim/reportcast
Transform reports into podcasts with AI - Nobody reads your reports. But they'll listen.