mrtozner/vox
Local voice AI framework for Rust. Whisper + LLM + TTS with no cloud dependencies.
Pluggable VAD, STT, and TTS backends, built on a trait-based architecture, let you swap between Whisper, Sherpa, or streaming engines and between TTS providers (Kokoro, Piper, Pocket, Chatterbox). Exposes a CLI, Rust and Python libraries, and HTTP/WebSocket APIs for real-time streaming transcription and synthesis. Models are downloaded automatically on first run, and backends are configurable: Silero VAD feeds audio to the chosen STT backend, results flow through user callbacks, and those callbacks can optionally trigger TTS playback.
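The VAD-to-STT-to-callback-to-TTS flow described above can be sketched roughly as follows. This is a minimal illustration, not vox's actual API: the trait names (`Vad`, `Stt`, `Tts`), the `Pipeline` type, and the stand-in backends (`EnergyVad`, `CountingStt`, `SilentTts`) are all hypothetical, and a simple energy threshold stands in for Silero VAD.

```rust
// Hypothetical sketch: these traits and types are illustrative only and are
// NOT taken from the vox crate.

/// Decides whether one frame of audio contains speech.
trait Vad {
    fn is_speech(&mut self, frame: &[f32]) -> bool;
}

/// Converts a buffered utterance into text.
trait Stt {
    fn transcribe(&mut self, samples: &[f32]) -> String;
}

/// Synthesizes audio samples from text.
trait Tts {
    fn synthesize(&mut self, text: &str) -> Vec<f32>;
}

/// Stand-in VAD: mean absolute amplitude above a threshold counts as speech.
struct EnergyVad {
    threshold: f32,
}

impl Vad for EnergyVad {
    fn is_speech(&mut self, frame: &[f32]) -> bool {
        if frame.is_empty() {
            return false;
        }
        let mean = frame.iter().map(|s| s.abs()).sum::<f32>() / frame.len() as f32;
        mean > self.threshold
    }
}

/// Stand-in STT: reports the utterance length instead of real text.
struct CountingStt;

impl Stt for CountingStt {
    fn transcribe(&mut self, samples: &[f32]) -> String {
        format!("utterance of {} samples", samples.len())
    }
}

/// Stand-in TTS: emits one silent sample per character of input text.
struct SilentTts;

impl Tts for SilentTts {
    fn synthesize(&mut self, text: &str) -> Vec<f32> {
        vec![0.0; text.chars().count()]
    }
}

/// Wires the three stages together; any backend can be swapped via its trait.
struct Pipeline {
    vad: Box<dyn Vad>,
    stt: Box<dyn Stt>,
    tts: Option<Box<dyn Tts>>,
    buffer: Vec<f32>,
}

impl Pipeline {
    fn new(vad: Box<dyn Vad>, stt: Box<dyn Stt>, tts: Option<Box<dyn Tts>>) -> Self {
        Pipeline { vad, stt, tts, buffer: Vec::new() }
    }

    /// Feeds one frame. Speech frames are buffered; the first non-speech frame
    /// after buffered speech closes the utterance, transcribes it, calls the
    /// user callback, and returns synthesized audio if a TTS backend is set.
    fn push_frame(&mut self, frame: &[f32], mut on_text: impl FnMut(&str)) -> Option<Vec<f32>> {
        if self.vad.is_speech(frame) {
            self.buffer.extend_from_slice(frame);
            return None;
        }
        if self.buffer.is_empty() {
            return None;
        }
        let text = self.stt.transcribe(&self.buffer);
        self.buffer.clear();
        on_text(&text);
        self.tts.as_mut().map(|tts| tts.synthesize(&text))
    }
}
```

Because each stage sits behind a boxed trait object, replacing `CountingStt` with a different engine only requires another `impl Stt`; the pipeline code itself does not change. Passing `None` for the TTS slot gives a transcription-only pipeline.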
Stars
19
Forks
4
Language
Rust
License
Apache-2.0
Category
Last pushed
Feb 17, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/mrtozner/vox"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
Higher-rated alternatives
TrevorS/voxtral-mini-realtime-rs
Streaming speech recognition running natively and in the browser. A pure Rust implementation of...
izwi-ai/izwi
On-device AI engine for transcription, TTS, and voice workflows.
darkautism/sensevoice-rs
A Rust-based, SenseVoiceSmall
thewh1teagle/vad-rs
Speech detection using silero vad in Rust
0xPD33/sonori
Sonori is a fully local STT app for Linux (Wayland).