whisperX and docker-whisperX

WhisperX is the core ASR and diarization library; docker-whisperX is a containerized distribution of it for easier deployment. The two are complements rather than competitors: the Dockerfile packages the original tool for users who prefer containerized environments.

whisperX
Score: 90 (Verified)
Maintenance 20/25, Adoption 25/25, Maturity 25/25, Community 20/25
Stars: 20,758
Forks: 2,188
Downloads: 864,629
Commits (30d): 15
Language: Python
License: BSD-2-Clause
Risk flags: none

docker-whisperX
Score: 57 (Established)
Maintenance 13/25, Adoption 10/25, Maturity 16/25, Community 18/25
Stars: 422
Forks: 49
Downloads: n/a
Commits (30d): 0
Language: Dockerfile
License: MIT
Risk flags: No Package, No Dependents
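
The overall scores above appear to be the plain sum of the four category scores, each out of 25. A quick check, assuming that is how the totals are derived (the page does not spell out the formula here):

```python
# Category scores as listed on this page (each out of 25).
# Assumption: the overall score is the simple sum of the four categories.
category_scores = {
    "whisperX": {"Maintenance": 20, "Adoption": 25, "Maturity": 25, "Community": 20},
    "docker-whisperX": {"Maintenance": 13, "Adoption": 10, "Maturity": 16, "Community": 18},
}

totals = {name: sum(parts.values()) for name, parts in category_scores.items()}
print(totals)  # {'whisperX': 90, 'docker-whisperX': 57}
```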

About whisperX

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Builds on OpenAI's Whisper by combining faster-whisper for batched GPU inference (up to 70x real-time transcription with large-v2) with wav2vec2 forced phoneme alignment to produce accurate word-level timestamps. Integrates pyannote-audio for speaker diarization and includes VAD preprocessing to reduce hallucinations while maintaining quality. Supports multiple languages, automatically selecting a language-specific alignment model from HuggingFace or torchaudio.
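
To illustrate how word-level timestamps and diarization fit together, here is a minimal sketch that labels each aligned word with the speaker whose turn overlaps it most. It is an illustration of the idea only, not WhisperX's own implementation; the `Word` and `Turn` types, the `assign_speakers` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    """A transcribed word with aligned start/end times in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


@dataclass
class Turn:
    """A diarization segment: one speaker talking over a time span (hypothetical type)."""
    speaker: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words: List[Word], turns: List[Turn]) -> List[Tuple[str, Optional[str]]]:
    """Label each word with the speaker whose turn overlaps it most, or None."""
    labelled = []
    for word in words:
        best_speaker = None
        best_overlap = 0.0
        for turn in turns:
            shared = overlap(word.start, word.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labelled.append((word.text, best_speaker))
    return labelled


# Made-up sample data for illustration only.
words = [Word("hello", 0.0, 0.4), Word("there", 0.5, 0.9), Word("hi", 1.2, 1.5)]
turns = [Turn("SPEAKER_00", 0.0, 1.0), Turn("SPEAKER_01", 1.0, 2.0)]

labelled_words = assign_speakers(words, turns)
print(labelled_words)
```

Words that fall outside every speaker turn keep a `None` label, which makes gaps in the diarization output easy to spot.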

About docker-whisperX

jim60105/docker-whisperX

Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)

Optimizes layer caching and parallel builds to efficiently manage 175 pre-built Docker images (~10GB each) on GitHub's free runners with weekly CI updates. Provides 40+ pre-baked model variants across languages (tiny to large-v3) alongside a `no_model` tag for custom model selection, with GPU acceleration support via NVIDIA Container Toolkit.
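
As a rough sketch of how one of these tags might be used, the snippet below assembles a `docker run` command for a chosen tag. Several details are assumptions not stated on this page and should be checked against the project's README: the registry path `ghcr.io/jim60105/whisperx`, the `/app` mount point, the `--` separator before WhisperX arguments, and the example tag name `base-en`. The `docker_whisperx_command` helper itself is hypothetical. Only `no_model` is a tag named above.

```python
import shlex
from typing import List, Sequence


def docker_whisperx_command(
    audio_file: str,
    tag: str = "no_model",
    whisperx_args: Sequence[str] = (),
) -> List[str]:
    """Build (but do not run) a docker command line for a docker-whisperX image.

    Assumptions: image path, /app mount point, and the "--" separator are
    illustrative guesses, not confirmed by this page.
    """
    image = f"ghcr.io/jim60105/whisperx:{tag}"  # assumed registry path
    return [
        "docker", "run", "--rm",
        "--gpus", "all",      # GPU access via NVIDIA Container Toolkit
        "-v", ".:/app",       # assumed working directory inside the container
        image,
        "--",                 # assumed separator before WhisperX CLI arguments
        *whisperx_args,
        audio_file,
    ]


# Example: a hypothetical pre-baked English tag, writing SRT subtitles.
command = docker_whisperx_command("audio.mp3", tag="base-en",
                                  whisperx_args=["--output_format", "srt"])
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps arguments with spaces intact if it is later passed to `subprocess.run`.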

Scores updated daily from GitHub, PyPI, and npm data.