WhisperLive and whisper-clip
WhisperLive provides a streaming speech-to-text inference server, while WhisperClip is a client application that transcribes audio to the clipboard and could in principle use such a server as a backend. That makes them potential complements rather than competitors, though WhisperClip's architecture is not explicitly tied to WhisperLive.
About WhisperLive
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Supports multiple inference backends (Faster-Whisper, TensorRT-LLM, and OpenVINO) for optimized performance across different hardware, with pluggable model sizes and a client-server architecture for concurrent transcription. Features Voice Activity Detection, real-time translation between any pair of languages, and OpenAI-compatible REST API endpoints alongside native WebSocket streaming for low-latency audio input from microphones or files.
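To make the streaming idea concrete, here is a minimal sketch of how a client might cut 16-bit mono PCM audio into fixed-size frames and skip silent ones before sending them to a server. It is illustrative only: the function names, the 16 kHz sample rate, the 20 ms frame length, and the energy threshold are all assumptions, and a simple RMS energy gate stands in for whatever Voice Activity Detection model WhisperLive actually uses.

```python
import math
import struct
from typing import Iterator, List

SAMPLE_RATE = 16_000                            # assumed sample rate in Hz
FRAME_MS = 20                                   # assumed frame length in ms
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000  # 320 samples per frame


def frames(pcm: bytes, frame_samples: int = FRAME_SAMPLES) -> Iterator[bytes]:
    """Yield consecutive full frames of 16-bit mono PCM; a trailing partial frame is dropped."""
    step = frame_samples * 2  # two bytes per 16-bit sample
    for start in range(0, len(pcm) - step + 1, step):
        yield pcm[start:start + step]


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a little-endian 16-bit PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def voiced_frames(pcm: bytes, threshold: float = 500.0) -> List[bytes]:
    """Keep only frames whose energy reaches the (assumed) threshold."""
    return [f for f in frames(pcm) if rms(f) >= threshold]


def demo_pcm() -> bytes:
    """Half a second of silence followed by half a second of a 440 Hz tone."""
    half = SAMPLE_RATE // 2
    silence = [0] * half
    tone = [int(10_000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
            for n in range(half)]
    return struct.pack(f"<{SAMPLE_RATE}h", *(silence + tone))
```

In a real client, each frame returned by `voiced_frames` would be written to the WebSocket connection as it becomes available; gating on energy first avoids sending silence to the server at all.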
About whisper-clip
gustavostz/whisper-clip
WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI’s Whisper for free.
Technical Summary
Leverages `faster-whisper` for accelerated local inference with quantization options (int8, float16) and optional CUDA GPU support, ensuring all audio processing remains on-device. Exposes a FastAPI-based transcription server enabling remote mobile clients via iOS Shortcuts and Android to submit audio over VPN (Tailscale/Meshnet) and receive clipboard-ready transcriptions. Includes real-time audio visualization, configurable hotword biasing, and a system tray interface with global hotkey activation.
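As a rough illustration of the quantization choice described here, the sketch below picks a `faster-whisper` compute type depending on whether a CUDA GPU is available and passes a hotword string through to transcription. The helper names (`choose_settings`, `transcribe_file`), the `small` model size, and the int8-on-CPU, float16-on-GPU policy are assumptions made for illustration, not WhisperClip's actual code. `faster_whisper` is imported lazily so the settings helper works without the library installed, and the `hotwords` argument assumes a `faster-whisper` release recent enough to support it.

```python
from typing import Dict, Optional


def choose_settings(cuda_available: bool) -> Dict[str, str]:
    """Assumed policy: float16 on a CUDA GPU, int8 quantization on CPU."""
    if cuda_available:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def transcribe_file(
    path: str,
    cuda_available: bool = False,
    hotwords: Optional[str] = None,
    model_size: str = "small",  # assumed default, not taken from WhisperClip
) -> str:
    """Transcribe a local audio file entirely on-device and return plain text."""
    # Third-party dependency, imported lazily: pip install faster-whisper
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, **choose_settings(cuda_available))
    # transcribe() returns a lazy generator of segments plus run metadata.
    segments, _info = model.transcribe(path, hotwords=hotwords)
    return " ".join(segment.text.strip() for segment in segments)
```

The returned string is what a tool of this kind would then place on the clipboard or send back to a remote client in an HTTP response.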