thewh1teagle/pyannote-rs
pyannote audio diarization in Rust
pyannote-rs runs a two-stage pipeline on ONNX Runtime: pyannote's segmentation model detects speech over sliding windows, and WeSpeaker embeddings identify the speaker of each detected segment. Audio features are extracted with knf-rs, and speakers are clustered by cosine similarity between their embeddings. On CPU it processes roughly one hour of audio per minute of compute, and it supports hardware acceleration via DirectML (Windows) and CoreML (macOS).
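The clustering step can be illustrated with a minimal sketch. This is not pyannote-rs's actual API: `cosine_similarity`, `SpeakerRegistry`, `assign`, and the threshold value are hypothetical names and parameters chosen for illustration, and the greedy "best match above a threshold, otherwise new speaker" rule is an assumed strategy rather than the crate's confirmed method. The two-dimensional vectors in the comments stand in for real speaker embeddings.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns 0.0 if the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Hypothetical registry of known speakers (illustrative, not the crate's type).
/// Each speaker is represented by the first embedding seen for them.
struct SpeakerRegistry {
    /// Minimum similarity for a segment to count as a known speaker (assumed value).
    threshold: f32,
    speakers: Vec<Vec<f32>>,
}

impl SpeakerRegistry {
    fn new(threshold: f32) -> Self {
        Self { threshold, speakers: Vec::new() }
    }

    /// Returns the index of the most similar known speaker if its similarity
    /// reaches the threshold; otherwise registers the embedding as a new
    /// speaker and returns the new index.
    fn assign(&mut self, embedding: &[f32]) -> usize {
        let best = self
            .speakers
            .iter()
            .enumerate()
            .map(|(i, known)| (i, cosine_similarity(known, embedding)))
            .max_by(|a, b| a.1.total_cmp(&b.1));

        match best {
            Some((index, score)) if score >= self.threshold => index,
            _ => {
                self.speakers.push(embedding.to_vec());
                self.speakers.len() - 1
            }
        }
    }
}
```

With a threshold of 0.5, calling `assign` on `[1.0, 0.0]` and then `[0.9, 0.1]` yields the same speaker index, while `[0.0, 1.0]` is registered as a new speaker. In a real pipeline the embeddings would come from the embedding model, and the threshold would need tuning on actual audio.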
108 stars and 850 monthly downloads. No commits in the last 6 months.
Stars: 108
Forks: 21
Language: Rust
License: MIT
Category:
Last pushed: Sep 07, 2025
Monthly downloads: 850
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/thewh1teagle/pyannote-rs"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Related tools
shibing624/parrots
Automatic speech recognition (ASR) and text-to-speech (TTS) engine. Chinese and English speech recognition and multi-speaker speech synthesis, with multilingual support and high accuracy
MainRo/deepspeech-server
A testing server for a speech to text service based on coqui.ai
altunenes/parakeet-rs
Very fast speech-to-text, diarization, and streaming (even on CPU) with NVIDIA Parakeet in Rust
PaddlePaddle/Parakeet
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS,...
daanzu/deepspeech-websocket-server
Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments