whisperX and CrisperWhisper

Both projects extend OpenAI's Whisper, but in different directions, making them complements rather than direct competitors: WhisperX layers fast batched inference, forced-alignment word timestamps, and speaker diarization on top of Whisper, while CrisperWhisper retrains Whisper itself for verbatim transcription, with filler detection and timestamps that stay accurate around pauses and disfluencies.

                whisperX          CrisperWhisper
Score           90 (Verified)     43 (Emerging)
Maintenance     20/25             2/25
Adoption        25/25             10/25
Maturity        25/25             16/25
Community       20/25             15/25
Stars           20,758            927
Forks           2,188             48
Downloads       864,629           —
Commits (30d)   15                0
Language        Python            Python
License         BSD-2-Clause      —
Risk flags      None              Stale 6m, No Package, No Dependents
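
The overall scores appear to be the simple sum of the four 25-point subscores; that formula is inferred from the numbers shown above, not from any documented scoring rule. A minimal sketch:

```python
def overall_score(maintenance: int, adoption: int, maturity: int, community: int) -> int:
    """Sum of four 25-point subscores (inferred from the card data, not documented)."""
    for s in (maintenance, adoption, maturity, community):
        assert 0 <= s <= 25, "each subscore is capped at 25"
    return maintenance + adoption + maturity + community

# Reproducing the two cards above:
whisperx_score = overall_score(maintenance=20, adoption=25, maturity=25, community=20)        # 90
crisperwhisper_score = overall_score(maintenance=2, adoption=10, maturity=16, community=15)   # 43
```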

About whisperX

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Builds on OpenAI's Whisper, combining faster-whisper for batched GPU inference (up to 70× real-time with large-v2) with wav2vec2 forced phoneme alignment for accurate word-level timing. Integrates pyannote-audio for speaker diarization and applies VAD preprocessing to reduce hallucinations without sacrificing quality. Supports multiple languages, automatically selecting language-specific alignment models from Hugging Face and torchaudio.
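
A minimal sketch of that two-stage pipeline, following the usage shown in the m-bain/whisperX README. The import is deferred so the sketch loads without the package installed; the model name, compute type, and device are assumptions you would tune for your hardware:

```python
def transcribe_and_align(audio_path: str, device: str = "cuda", batch_size: int = 16):
    """Sketch of the WhisperX pipeline: batched transcription via faster-whisper,
    then wav2vec2 forced alignment for word-level timestamps.
    Requires `pip install whisperx`; API per the m-bain/whisperX README."""
    import whisperx  # deferred so this sketch can be defined without the package

    # 1. Batched transcription (VAD preprocessing is applied internally).
    model = whisperx.load_model("large-v2", device, compute_type="float16")
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=batch_size)

    # 2. Forced phoneme alignment with a language-specific wav2vec2 model,
    #    selected automatically from the detected language.
    align_model, metadata = whisperx.load_align_model(
        language_code=result["language"], device=device
    )
    result = whisperx.align(result["segments"], align_model, metadata, audio, device)
    return result  # segments now carry word-level start/end times
```

Diarization is a further optional stage (pyannote-audio behind a Hugging Face token), which assigns speaker labels to the aligned words.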

About CrisperWhisper

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Built on OpenAI's Whisper, CrisperWhisper employs a custom tokenizer and attention loss mechanism during training to achieve precise word-level timestamp alignment, particularly around disfluencies and pauses. It integrates seamlessly with both 🤗 Transformers and Faster Whisper pipelines, enabling deployment in existing speech recognition workflows. The model prioritizes verbatim transcription including fillers ("um", "uh") and speech artifacts, ranking first on the OpenASR Leaderboard for verbatim datasets like TED-LIUM and AMI.
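
Because CrisperWhisper is published as a standard Hugging Face checkpoint, the 🤗 Transformers ASR pipeline can load it directly. A hedged sketch (import deferred so it loads without the package; the model id is from the repo above, other settings are assumptions):

```python
def verbatim_transcribe(audio_path: str, device: str = "cuda"):
    """Sketch: verbatim ASR with word-level timestamps via the
    Transformers pipeline, as the CrisperWhisper README suggests.
    Requires `pip install transformers` plus a torch backend."""
    from transformers import pipeline  # deferred import

    asr = pipeline(
        "automatic-speech-recognition",
        model="nyrahealth/CrisperWhisper",
        device=device,
        return_timestamps="word",  # per-word timestamps; fillers like "um" are kept
    )
    return asr(audio_path)
```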

Scores updated daily from GitHub, PyPI, and npm data.