GeminiASR and gemini-speech2srt

These are **competitors**: both tools independently convert audio/video to SRT subtitles using the Google Gemini API, performing the same transcription-to-subtitle function without dependency on each other.

| | GeminiASR | gemini-speech2srt |
|---|---|---|
| Status | Emerging | Emerging |
| Maintenance | 6/25 | 0/25 |
| Adoption | 6/25 | 8/25 |
| Maturity | 9/25 | 8/25 |
| Community | 15/25 | 18/25 |
| Stars | 17 | 54 |
| Forks | 5 | 13 |
| Downloads | — | — |
| Commits (30d) | 0 | 0 |
| Language | Python | Python |
| License | MIT | — |
| Flags | No Package, No Dependents | No License, Stale 6m, No Package, No Dependents |

About GeminiASR

cxyfer/GeminiASR

A Python tool that uses Google Gemini API to transcribe video or audio files into SRT subtitle files.

Supports batch processing with configurable chunking (default 900-second segments) and multi-threaded parallel transcription, with optional rotation across multiple Google API keys to work around per-key rate limits. Features a four-tier configuration system in which CLI arguments override environment variables, which override TOML files, which override built-in defaults, and supports custom prompts to guide transcription quality. Also works with OpenAI-compatible endpoints and proxy services such as gemini-balance, so alternative model providers can be used while keeping the same SRT output interface.
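The precedence, chunking, and key-rotation ideas described above can be sketched in a few lines of Python. This is an illustrative sketch only, not GeminiASR's actual code: the function names, the `max_workers` setting and its default, and the use of integer seconds are all assumptions made for the example. Only the 900-second default chunk length and the tier order come from the description.

```python
from collections import ChainMap
from itertools import cycle

# 900 s comes from the description above; max_workers is a hypothetical setting.
DEFAULTS = {"chunk_seconds": 900, "max_workers": 4}


def resolve_config(cli_args, env_vars, toml_values, defaults=DEFAULTS):
    """Merge the four tiers; earlier tiers win (CLI > env > TOML > defaults)."""
    # Drop unset (None) values so a missing CLI flag does not mask lower tiers.
    tiers = [
        {key: value for key, value in tier.items() if value is not None}
        for tier in (cli_args, env_vars, toml_values, defaults)
    ]
    return dict(ChainMap(*tiers))


def chunk_ranges(duration_seconds, chunk_seconds=900):
    """Return (start, end) offsets in whole seconds covering the media file."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return [
        (start, min(start + chunk_seconds, duration_seconds))
        for start in range(0, duration_seconds, chunk_seconds)
    ]


def assign_keys(chunks, api_keys):
    """Pair each chunk with an API key in round-robin order."""
    if not api_keys:
        raise ValueError("at least one API key is required")
    return list(zip(chunks, cycle(api_keys)))
```

With these helpers, a 2000-second file splits into three chunks, and two keys alternate across them. Each (chunk, key) pair could then be handed to a worker pool such as `concurrent.futures.ThreadPoolExecutor` for parallel transcription.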

About gemini-speech2srt

jianchang512/gemini-speech2srt

Transcribe audio and video into SRT subtitles using Gemini AI

Uses VAD (Voice Activity Detection) to split media into speech-bounded chunks and sends each chunk to Gemini AI separately, which keeps subtitle timing accurate and avoids the timeline drift that occurs when a whole file is processed in one request. Provides both a Windows GUI executable and cross-platform Python deployment, with configurable prompts and proxy support for regions where Gemini access is restricted.
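To illustrate why speech-bounded chunks limit drift, here is a minimal Python sketch. It is not taken from gemini-speech2srt: the function names, the 60-second chunk limit, and the input shapes are assumptions. It presumes a VAD has already produced `(start, end)` speech intervals in seconds, and that the model returns cue times relative to the start of each chunk. Because every cue is offset by a known chunk start, any timing error stays confined to one chunk instead of accumulating across the file.

```python
def group_speech_segments(speech, max_chunk_seconds=60.0):
    """Group sorted VAD speech intervals into chunks cut on speech boundaries."""
    chunks = []
    for start, end in speech:
        if chunks and end - chunks[-1][0] <= max_chunk_seconds:
            # Extend the current chunk to include this speech interval.
            chunks[-1] = (chunks[-1][0], end)
        else:
            # Start a new chunk at the beginning of this speech interval.
            chunks.append((start, end))
    return chunks


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(chunk_start, cues, first_index=1):
    """Render chunk-relative cues (rel_start, rel_end, text) as SRT blocks."""
    blocks = []
    for index, (rel_start, rel_end, text) in enumerate(cues, start=first_index):
        begin = srt_timestamp(chunk_start + rel_start)
        finish = srt_timestamp(chunk_start + rel_end)
        blocks.append(f"{index}\n{begin} --> {finish}\n{text}\n")
    return "\n".join(blocks)
```

A real pipeline would also need to cut the audio at the chunk boundaries and carry the cue index across chunks; `first_index` is a placeholder for the latter.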

Scores updated daily from GitHub, PyPI, and npm data.