WhisperKit and whisper-web
These are **complements**: WhisperKit provides an optimized on-device inference engine for Apple Silicon, while whisper-web runs transcription entirely in the browser, so developers can choose whichever platform (native iOS/macOS or web) best fits their deployment target.
About WhisperKit
argmaxinc/WhisperKit
On-device Speech Recognition for Apple Silicon
Wraps OpenAI's Whisper models in CoreML format optimized for Apple Silicon, enabling real-time streaming transcription with word-level timestamps and voice activity detection. Provides a Swift SDK, a local HTTP server with a Deepgram-compatible WebSocket API, and companion tools for speaker diarization and text-to-speech. Supports model customization through fine-tuning workflows, with deployment to HuggingFace repositories for easy distribution across projects.
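Because the local server speaks a Deepgram-compatible protocol, a client can reuse Deepgram-style response parsing. Below is a minimal sketch of that parsing in TypeScript, assuming the Deepgram streaming "Results" message layout (`channel.alternatives[0]` carrying a `transcript` plus per-word timestamps); the exact subset of fields WhisperKit's server emits is an assumption here, not documented behavior.

```typescript
// Shape of a Deepgram-style streaming "Results" message (subset).
// Assumption: WhisperKit's Deepgram-compatible server emits this layout.
interface Word {
  word: string;
  start: number; // seconds
  end: number;   // seconds
}

interface ResultsMessage {
  is_final: boolean;
  channel: { alternatives: { transcript: string; words: Word[] }[] };
}

// Pull the best transcript and its word-level timestamps out of one message.
function extractTranscript(msg: ResultsMessage): { text: string; words: Word[] } {
  const alt = msg.channel.alternatives[0];
  return { text: alt?.transcript ?? "", words: alt?.words ?? [] };
}

// Example message, as a client would receive it over the WebSocket.
const sample: ResultsMessage = {
  is_final: true,
  channel: {
    alternatives: [
      {
        transcript: "hello world",
        words: [
          { word: "hello", start: 0.0, end: 0.4 },
          { word: "world", start: 0.45, end: 0.9 },
        ],
      },
    ],
  },
};

console.log(extractTranscript(sample).text); // "hello world"
```

In a real client the `sample` object would arrive as a parsed WebSocket message from the local server; only the parsing logic is shown here.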
About whisper-web
xenova/whisper-web
ML-powered speech recognition directly in your browser
Leverages Transformers.js to run OpenAI's Whisper model entirely client-side, with no server dependencies. Supports multiple languages and includes experimental WebGPU acceleration for GPU-backed inference. Built as a web application deployable on static hosting, with Web Worker support for non-blocking audio processing.
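Running Whisper client-side means the browser must also handle the audio preprocessing the model expects: decoded audio is resampled to 16 kHz and collapsed to a single mono `Float32Array` before inference. A minimal sketch of the downmix step, assuming a plain per-sample channel average (whisper-web's exact scaling may differ):

```typescript
// Whisper expects mono 16 kHz float PCM. After the browser's AudioContext
// (constructed with { sampleRate: 16000 }) decodes a file, a stereo buffer
// still has to be collapsed to one channel before it reaches the model.
// Assumption: a plain per-sample average; the app's exact scaling may differ.
function downmixToMono(left: Float32Array, right: Float32Array): Float32Array {
  const n = Math.min(left.length, right.length);
  const mono = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    mono[i] = (left[i] + right[i]) / 2;
  }
  return mono;
}

// Values chosen to be exact in 32-bit floats.
const left = new Float32Array([0.5, 1.0]);
const right = new Float32Array([0.5, 0.0]);
console.log(Array.from(downmixToMono(left, right))); // [0.5, 0.5]
```

In whisper-web itself this kind of preprocessing runs off the main thread: the page posts the decoded samples to a Web Worker, which runs the Transformers.js pipeline and posts partial transcripts back, keeping the UI responsive.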