WhisperLive and whisper-clip

WhisperLive is a streaming speech-to-text inference server, while WhisperClip is a client application that transcribes recorded audio to the clipboard. The two are potential complements rather than competitors: a client like WhisperClip could in principle use a backend like WhisperLive, although WhisperClip's architecture is not explicitly tied to it.

| | WhisperLive | whisper-clip |
|---|---|---|
| Score | 68 | 53 |
| Status | Established | Established |
| Maintenance | 20/25 | 13/25 |
| Adoption | 10/25 | 10/25 |
| Maturity | 16/25 | 16/25 |
| Community | 22/25 | 14/25 |
| Stars | 3,894 | 137 |
| Forks | 536 | 16 |
| Downloads | | |
| Commits (30d) | 13 | 0 |
| Language | Python | Python |
| License | MIT | MIT |
| Package | No package | No package |
| Dependents | No dependents | No dependents |
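
The overall scores listed above (68 and 53) appear to equal the sum of the four 25-point category scores. The sketch below checks that arithmetic; the summing formula is an inference from the numbers shown here, not a documented scoring rule, and the function name `overall` is illustrative.

```python
# Category scores as listed above; each category is out of 25.
scores = {
    "WhisperLive": {"Maintenance": 20, "Adoption": 10, "Maturity": 16, "Community": 22},
    "whisper-clip": {"Maintenance": 13, "Adoption": 10, "Maturity": 16, "Community": 14},
}


def overall(categories):
    """Sum the four 25-point categories into a 0-100 total (assumed formula)."""
    return sum(categories.values())


totals = {name: overall(cats) for name, cats in scores.items()}
print(totals)  # {'WhisperLive': 68, 'whisper-clip': 53}

# Per-category gap between the two projects (WhisperLive minus whisper-clip).
gap = {
    cat: scores["WhisperLive"][cat] - scores["whisper-clip"][cat]
    for cat in scores["WhisperLive"]
}
print(gap)  # {'Maintenance': 7, 'Adoption': 0, 'Maturity': 0, 'Community': 8}
```

Under that assumption, the 15-point difference comes entirely from Maintenance (7) and Community (8); Adoption and Maturity are tied.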

About WhisperLive

collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Supports multiple inference backends (Faster-Whisper, TensorRT-LLM, and OpenVINO) for optimized performance across different hardware, with pluggable model sizes and a client-server architecture for concurrent transcription. Features Voice Activity Detection, real-time translation between any pair of languages, and OpenAI-compatible REST API endpoints alongside native WebSocket streaming for low-latency audio input from microphones or files.
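
To make the streaming side of a client-server design like this more concrete, here is a minimal sketch of how a client might slice recorded audio into chunks before pushing them over a socket. It is not WhisperLive's client code: the chunk size, the float32 wire format, and the `send` callable are assumptions made for illustration, and no real WebSocket library is used.

```python
from array import array
from typing import Callable, Iterator

SAMPLE_RATE = 16_000  # Whisper models operate on 16 kHz mono audio
CHUNK_SAMPLES = 4096  # hypothetical chunk size, not taken from WhisperLive


def pcm16_to_float32_chunks(pcm: bytes, chunk_samples: int = CHUNK_SAMPLES) -> Iterator[bytes]:
    """Yield float32 byte chunks (native byte order) from 16-bit mono PCM.

    Raises ValueError if len(pcm) is not a multiple of 2 bytes.
    """
    samples = array("h")
    samples.frombytes(pcm)
    for start in range(0, len(samples), chunk_samples):
        block = samples[start:start + chunk_samples]
        # Scale int16 samples into the range [-1.0, 1.0).
        yield array("f", (s / 32768.0 for s in block)).tobytes()


def stream(pcm: bytes, send: Callable[[bytes], None]) -> int:
    """Push every chunk through `send` (a stand-in for a socket send) and count them."""
    count = 0
    for chunk in pcm16_to_float32_chunks(pcm):
        send(chunk)
        count += 1
    return count


# Usage: 9000 synthetic samples, collected in a list instead of a real socket.
demo = array("h", [0, 16384, -32768] * 3000).tobytes()
sent = []
n = stream(demo, sent.append)
print(n, [len(c) for c in sent])  # 3 [16384, 16384, 3232]
```

Passing `send` in as a callable keeps the chunking logic independent of the transport, so the same function could feed a WebSocket connection, a file, or a test double.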

About whisper-clip

gustavostz/whisper-clip

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI’s Whisper for free.

Technical Summary

Leverages `faster-whisper` for accelerated local inference with quantization options (int8, float16) and optional CUDA GPU support, ensuring all audio processing remains on-device. Exposes a FastAPI-based transcription server that lets remote mobile clients (iOS Shortcuts and Android) submit audio over a VPN (Tailscale/Meshnet) and receive clipboard-ready transcriptions. Includes real-time audio visualization, configurable hotword biasing, and a system tray interface with global hotkey activation.
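
As a rough illustration of the quantization and GPU options mentioned above, the sketch below picks a `faster-whisper` device and compute type and passes an optional hotword string to the transcriber. This is not WhisperClip's actual code: `InferenceConfig`, `choose_config`, the `"small"` model size, and the float16-on-GPU / int8-on-CPU pairing are assumptions, and the `hotwords` argument is only available in recent `faster-whisper` releases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InferenceConfig:
    device: str        # "cuda" or "cpu"
    compute_type: str  # "float16" or "int8"


def choose_config(cuda_available: bool) -> InferenceConfig:
    """Assumed policy: float16 on a CUDA GPU, int8 quantization on CPU."""
    if cuda_available:
        return InferenceConfig(device="cuda", compute_type="float16")
    return InferenceConfig(device="cpu", compute_type="int8")


def transcribe(path: str, cfg: InferenceConfig, hotwords: Optional[str] = None) -> str:
    """Transcribe one audio file locally and return the joined text."""
    # Imported lazily so the config logic works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # "small" is a placeholder model size chosen for this sketch.
    model = WhisperModel("small", device=cfg.device, compute_type=cfg.compute_type)
    segments, _info = model.transcribe(path, hotwords=hotwords)
    return " ".join(segment.text.strip() for segment in segments)


print(choose_config(cuda_available=False))
# InferenceConfig(device='cpu', compute_type='int8')
```

A call such as `transcribe("memo.wav", choose_config(False), hotwords="Tailscale Meshnet")` would then run entirely on the local machine; the file name and hotword string here are made up.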

Scores updated daily from GitHub, PyPI, and npm data.