whisperX and whisply

WhisperX provides the underlying speech recognition and diarization engine with word-level timestamps, while Whisply is a higher-level application layer that wraps Whisper backends (including WhisperX) to deliver batch processing and a user interface, making them complements rather than direct competitors.

Metric          whisperX        whisply
--------------  --------------  ----------------
Score           90 (Verified)   69 (Established)
Maintenance     20/25           13/25
Adoption        25/25           16/25
Maturity        25/25           25/25
Community       20/25           15/25
Stars           20,758          108
Forks           2,188           16
Downloads       864,629         1,597
Commits (30d)   15              0
Language        Python          Python
License         BSD-2-Clause    MIT
Risk flags      None            None

About whisperX

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Builds on OpenAI's Whisper by combining faster-whisper for batched GPU inference (up to 70x realtime) with wav2vec2 forced phoneme alignment to achieve sub-word timing accuracy. Integrates pyannote-audio for speaker diarization and includes VAD preprocessing to reduce hallucinations while maintaining quality. Supports multiple languages, with language-specific alignment models selected automatically from HuggingFace and torchaudio.
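The three stages described above (batched transcription, forced alignment, diarization) can be sketched with the whisperX Python API. Function names follow the project README; exact signatures and module paths may differ between releases, so treat this as an illustrative outline rather than a pinned recipe. The function name `transcribe_with_diarization` is our own.

```python
def transcribe_with_diarization(audio_path: str, hf_token: str, device: str = "cuda") -> dict:
    """Sketch of the three-stage whisperX pipeline: transcribe, align, diarize."""
    import whisperx  # heavyweight import kept local so defining the sketch stays cheap

    audio = whisperx.load_audio(audio_path)

    # 1. Batched transcription with the faster-whisper backend.
    model = whisperx.load_model("large-v2", device, compute_type="float16")
    result = model.transcribe(audio, batch_size=16)

    # 2. Forced phoneme alignment with a wav2vec2 model chosen
    #    automatically for the detected language.
    align_model, metadata = whisperx.load_align_model(
        language_code=result["language"], device=device
    )
    result = whisperx.align(result["segments"], align_model, metadata, audio, device)

    # 3. Speaker diarization via pyannote; a HuggingFace token is needed
    #    because the pyannote models are gated.
    diarize_model = whisperx.DiarizationPipeline(use_auth_token=hf_token, device=device)
    diarize_segments = diarize_model(audio)
    return whisperx.assign_word_speakers(diarize_segments, result)
```

Running it requires a GPU-capable install of whisperX plus accepted pyannote model licenses, so the sketch only defines the function.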

About whisply

tsmdt/whisply

💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker annotation and subtitle generation using OpenAI’s Whisper on CPU, Nvidia GPU and Apple MLX.

Leverages hardware-specific Whisper implementations (`faster-whisper` for CPUs/Nvidia, `mlx-whisper` for Apple Silicon) with automatic device detection, plus integrates `whisperX` and `pyannote` for word-level speaker diarization and customizable subtitle generation. Supports multiple export formats (JSON, SRT, VTT, HTML, RTTM) and batch processing via CLI, browser app, or config files for scalable transcription workflows.
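To make the export step concrete, here is a self-contained sketch that renders whisperX-style segments (the `{'start', 'end', 'text', 'speaker'}` structure that tools like whisply consume) as an SRT document. The sample data and function names are illustrative, not taken from either project's code.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments: list[dict]) -> str:
    """Render [{'start', 'end', 'text', 'speaker'?}, ...] as an SRT document."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        text = f"[{speaker}] {seg['text']}" if speaker else seg["text"]
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"

# Made-up sample resembling diarized whisperX output.
sample = [
    {"start": 0.0, "end": 2.5, "text": "Hello there.", "speaker": "SPEAKER_00"},
    {"start": 2.5, "end": 4.0, "text": "Hi!", "speaker": "SPEAKER_01"},
]
print(segments_to_srt(sample))
```

The same segment list could just as easily be serialized to VTT or JSON, which is essentially what whisply's multi-format export does at a higher level.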

Scores updated daily from GitHub, PyPI, and npm data.