rtk-ai/vox

A universal AI toolkit for high-performance Speech-to-Text (STT) and Text-to-Speech (TTS) processing, designed for low-latency and easy model integration.

Score: 34 / 100 (Emerging)

Supports five pluggable TTS backends (macOS `say`, ONNX-based `kokoro`, Rust/Candle `qwen-native`, PyTorch `voxtream`, and MLX `qwen`), three of which offer zero-shot voice cloning. Warm latency is 2–3 s on Apple Silicon and 19 s on CUDA. Built in Rust with Python interop, it exposes a daemon mode for persistent model loading and integrates as an MCP server or CLI tool into 14+ AI coding assistants (including Claude Code, Cursor, VS Code, and Zed). It also includes SQLite state tracking, interactive TUI configuration, and voice recording and cloning workflows that run entirely offline.

No Package, No Dependents
Maintenance 13 / 25
Adoption 7 / 25
Maturity 11 / 25
Community 3 / 25
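
The four category scores listed above add up to the headline score. The snippet below is only an arithmetic check on the numbers shown on this page. It assumes the total is a plain sum of the categories, which the page does not state explicitly, and the variable names are illustrative.

```python
# Category scores as listed on this page; each category is scored out of 25.
subscores = {
    "Maintenance": 13,
    "Adoption": 7,
    "Maturity": 11,
    "Community": 3,
}
MAX_PER_CATEGORY = 25

# Assumption: the headline score is the plain sum of the four categories.
total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # prints "34 / 100"
```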


Stars: 36
Forks: 1
Language: Rust
License:
Last pushed: Mar 07, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rtk-ai/vox"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
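
For scripted use, the same request can be made from Python's standard library. This is a minimal sketch under stated assumptions: the `category/owner/repo` path pattern is inferred from the single URL above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical. How an API key is supplied is not documented here, so the sketch covers only keyless access.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern assumed from the curl example)."""
    # Percent-encode each segment so unusual characters cannot alter the path.
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + path


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (response format assumed)."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url("voice-ai", "rtk-ai", "vox"))
```

Since keyless access is limited to 100 requests/day, callers that poll this endpoint may want to cache results locally rather than refetch on every run.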