whisper-diarization and whisper-v3-diarization

These are **competitors**: both implement speaker diarization on top of Whisper to produce transcriptions with speaker labels. whisper-diarization builds its pipeline on NVIDIA NeMo models, while whisper-v3-diarization adds WhisperX for improved accuracy and timing precision and packages it as a CLI and GUI.

whisper-diarization (Established)

Maintenance 10/25
Adoption 10/25
Maturity 16/25
Community 20/25

Stars: 5,437
Forks: 500
Downloads: —
Commits (30d): 0
Language: Jupyter Notebook
License: BSD-2-Clause

whisper-v3-diarization (Experimental)

Maintenance 6/25
Adoption 1/25
Maturity 9/25
Community 12/25

Stars: 1
Forks: 1
Downloads: —
Commits (30d): 0
Language: Python
License: MIT

No Package, No Dependents

About whisper-diarization

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Combines Whisper with NVIDIA NeMo's voice activity detection and speaker embedding models (MarbleNet/TitaNet) to attribute transcribed text to individual speakers. Uses source separation (Demucs) for vocal extraction, CTC-forced alignment for precise timestamp correction, and punctuation-based realignment to compensate for temporal drift across segments. Outputs speaker-labeled transcriptions with segment-level timestamps, supporting configurable Whisper models and parallel inference modes for systems with sufficient VRAM.
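The core idea of attributing transcribed text to speakers can be illustrated with a minimal sketch. This is not the project's code: it assumes word-level timestamps (as a forced aligner would produce) and speaker turns (as a diarizer would produce) are already available, and every name below (`Word`, `Turn`, `assign_speakers`, `to_segments`) is hypothetical. Each word goes to the turn it overlaps most, with the nearest turn as a fallback, and consecutive words from the same speaker are merged into segments.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float


@dataclass
class Turn:
    speaker: str
    start: float  # seconds
    end: float


def overlap(word, turn):
    """Length in seconds of the intersection between a word and a turn."""
    return max(0.0, min(word.end, turn.end) - max(word.start, turn.start))


def gap(mid, turn):
    """Distance from a time point to a turn (0 if the point is inside it)."""
    if turn.start <= mid <= turn.end:
        return 0.0
    return min(abs(mid - turn.start), abs(mid - turn.end))


def assign_speakers(words, turns):
    """Label each word with a speaker. Assumes `turns` is non-empty."""
    labeled = []
    for w in words:
        mid = (w.start + w.end) / 2
        # Prefer the largest overlap; break ties with the nearest turn.
        best = max(turns, key=lambda t: (overlap(w, t), -gap(mid, t)))
        labeled.append((best.speaker, w))
    return labeled


def to_segments(labeled):
    """Merge consecutive words from the same speaker into one segment."""
    segments = []
    for speaker, w in labeled:
        if segments and segments[-1]["speaker"] == speaker:
            segments[-1]["end"] = w.end
            segments[-1]["text"] += " " + w.text
        else:
            segments.append(
                {"speaker": speaker, "start": w.start, "end": w.end, "text": w.text}
            )
    return segments


# Made-up example data, for illustration only.
words = [
    Word("hello", 0.0, 0.4),
    Word("there", 0.5, 0.9),
    Word("hi", 1.2, 1.5),
    Word("back", 1.6, 2.0),
]
turns = [Turn("A", 0.0, 1.0), Turn("B", 1.1, 2.1)]

segments = to_segments(assign_speakers(words, turns))
for seg in segments:
    print(f'{seg["speaker"]} [{seg["start"]:.1f}-{seg["end"]:.1f}] {seg["text"]}')
```

Overlap-based assignment is only one possible design; the punctuation-based realignment described above would act as a later correction pass and is not modeled here.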

About whisper-v3-diarization

TharanaBope/whisper-v3-diarization

Production-ready audio transcription & speaker diarization CLI & GUI using OpenAI Whisper and WhisperX

Related comparisons

whisper-diarization and whisperX

whisper-diarization and whisper-run

whisper-diarization and whisply

Scores updated daily from GitHub, PyPI, and npm data.