juanmc2005/diart

A Python package to build AI-powered real-time audio applications

Established

Leverages speaker segmentation and embedding models with incremental clustering for real-time speaker diarization that improves accuracy as conversations progress. Offers modular pipelines for voice activity detection and transcription, integrates pre-trained models from Hugging Face and Pyannote, and supports custom model integration via ONNX and PyTorch. Provides WebSocket support for web deployment and includes CLI tools for streaming from microphones or audio files.
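To make the incremental clustering idea concrete, here is a minimal sketch in plain Python. It is not diart's actual implementation or API: the `IncrementalClusterer` class, its `assign` method, and the `threshold` value are hypothetical names chosen for illustration, and the toy 2-D vectors stand in for real speaker embeddings. The sketch assumes each incoming embedding is compared to running speaker centroids by cosine distance, that a new speaker is created when no centroid is close enough, and that the matched centroid is otherwise updated with a running mean.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class IncrementalClusterer:
    """Hypothetical online clusterer: one running centroid per speaker."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed max distance to reuse a speaker
        self.centroids = []         # running mean embedding per speaker
        self.counts = []            # number of embeddings seen per speaker

    def assign(self, embedding):
        """Return a speaker index for `embedding`, updating the centroids."""
        best, best_dist = None, float("inf")
        for idx, centroid in enumerate(self.centroids):
            dist = cosine_distance(embedding, centroid)
            if dist < best_dist:
                best, best_dist = idx, dist

        # No speaker yet, or nothing close enough: open a new speaker.
        if best is None or best_dist > self.threshold:
            self.centroids.append(list(embedding))
            self.counts.append(1)
            return len(self.centroids) - 1

        # Otherwise fold the embedding into the matched centroid (running mean).
        n = self.counts[best]
        self.centroids[best] = [
            (c * n + e) / (n + 1) for c, e in zip(self.centroids[best], embedding)
        ]
        self.counts[best] = n + 1
        return best


# Toy stream of 2-D "embeddings" arriving one chunk at a time.
clusterer = IncrementalClusterer(threshold=0.5)
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
labels = [clusterer.assign(e) for e in stream]
print(labels)
```

Because each centroid is refined with every new embedding, speaker assignments in a scheme like this tend to stabilize as more audio arrives, which is the intuition behind accuracy improving as a conversation progresses.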

1,944 stars. No commits in the last 6 months. Available on PyPI.

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 19 / 25

Stars: 1,944

Forks: 159

Language: Python

License: MIT

Related frameworks

felixbur/nkululeko

Machine learning speaker characteristics

claritychallenge/clarity

Clarity Challenge toolkit - software for building Clarity Challenge systems

astorfi/3D-convolutional-speaker-recognition

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

wq2012/awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

hitachi-speech/EEND

End-to-End Neural Diarization
