thewh1teagle/pyannote-rs

pyannote audio diarization in rust

/ 100

Established

Uses a two-stage pipeline: Pyannote's segmentation model detects speech over sliding windows, and wespeaker embeddings identify the speakers, with ONNX Runtime handling inference. On CPU it processes roughly one hour of audio per minute, and it supports hardware acceleration via DirectML (Windows) and CoreML (macOS). Audio features are extracted with knf-rs, and speakers are clustered by cosine similarity between their embeddings.
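The cosine-similarity step described above can be illustrated with a minimal sketch. This is not the pyannote-rs API: the `SpeakerRegistry` type, its `assign` method, the greedy matching strategy, and the threshold value are all hypothetical, chosen only to show how an embedding might be matched against previously seen speakers.

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns 0.0 if either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Hypothetical greedy speaker registry (an assumption for illustration,
/// not the crate's real type): stores one reference embedding per speaker.
struct SpeakerRegistry {
    threshold: f32,
    speakers: Vec<Vec<f32>>,
}

impl SpeakerRegistry {
    fn new(threshold: f32) -> Self {
        Self { threshold, speakers: Vec::new() }
    }

    /// Returns the index of the most similar known speaker if the similarity
    /// reaches the threshold; otherwise registers the embedding as a new
    /// speaker and returns the new index.
    fn assign(&mut self, embedding: &[f32]) -> usize {
        let best = self
            .speakers
            .iter()
            .enumerate()
            .map(|(i, known)| (i, cosine_similarity(known, embedding)))
            .max_by(|a, b| a.1.total_cmp(&b.1));

        match best {
            Some((i, sim)) if sim >= self.threshold => i,
            _ => {
                self.speakers.push(embedding.to_vec());
                self.speakers.len() - 1
            }
        }
    }
}
```

In a pipeline like the one described, each speech segment found by the segmentation stage would be turned into an embedding and passed to something like `assign`, so that segments from the same voice receive the same speaker index. Real embeddings have far more than two dimensions, and a suitable threshold would need tuning on actual data.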

108 stars and 850 monthly downloads. No commits in the last 6 months.

Stale 6m | No Package | No Dependents

Maintenance 2 / 25

Adoption 16 / 25

Maturity 16 / 25

Community 19 / 25

Stars: 108

Forks

Language: Rust

License: MIT

Related tools

shibing624/parrots

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy

MainRo/deepspeech-server

A testing server for a speech to text service based on coqui.ai

altunenes/parakeet-rs

very fast speech-to-text, diarization, streaming (even in CPU) with NVIDIA Parakeet in Rust

PaddlePaddle/Parakeet

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS,...

daanzu/deepspeech-websocket-server

Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments
