gpustack/vox-box

A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.

50 / 100 (Established)

Models can be sourced from Hugging Face or ModelScope repositories, and inference can be GPU-accelerated via CUDA. The server runs on Linux, Windows, and macOS, with model sizes configurable from tiny to large variants. It uses a stateless server architecture that downloads and caches models automatically, supports both streaming (Paraformer-zh-streaming) and batch processing pipelines, and exposes CLI options for device binding and data directory management.
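Because the server is described as OpenAI API compatible, a text-to-speech call can be sketched as a plain HTTP request in the OpenAI request shape. This is a minimal sketch, not the project's documented usage: the `/v1/audio/speech` path and the `model`, `input`, and `voice` fields come from the OpenAI API, while the base URL, model name, and voice name are placeholders you would need to replace with values from your own deployment.

```python
import json
import urllib.request


def build_speech_request(base_url: str, model: str, text: str, voice: str) -> urllib.request.Request:
    """Build a POST request in the OpenAI text-to-speech request shape."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",  # OpenAI-style path (assumed to be served)
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(base_url: str, model: str, text: str, voice: str,
               out_path: str, timeout: float = 60.0) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    req = build_speech_request(base_url, model, text, voice)
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(out_path, "wb") as f:
        f.write(resp.read())


# Hypothetical usage; host, port, model, and voice are placeholders:
# synthesize("http://localhost:8000", "your-tts-model", "Hello", "your-voice", "out.audio")
```

Keeping request construction separate from the network call makes the payload easy to inspect before sending it to a running server.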

200 stars.

No Package, No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 200
Forks: 32
Language: Python
License: Apache-2.0
Last pushed: Dec 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gpustack/vox-box"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
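
The same request can be made from Python using only the standard library. This is a sketch under one assumption: that the endpoint returns a JSON body, which the listing does not state. The helper names `quality_url` and `fetch_quality` are illustrative, and the URL layout (category, owner, repository) is inferred from the curl example above.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (performs a network request and counts against the daily limit):
# data = fetch_quality("voice-ai", "gpustack", "vox-box")
# print(json.dumps(data, indent=2))
```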